Formal Matters 

Claims 42-82 are pending after entry of the amendments set forth herein. 

Claims 42-82 were examined: Claims 42-49 and 53-57 were rejected. Claims 50-52 and 58-82 
were withdrawn from consideration. 

Claims 42, 46, and 49 are amended. The amendments to the claims were made solely in the 
interest of expediting prosecution, and are not to be construed as an acquiescence to any objection or 
rejection of any claim. Support for the amendments to claims 42, 46, and 49 is found in the claims as 
originally filed, and throughout the specification, in particular at the following exemplary locations: 
Specification pages 8, 15, 19, and newly renumbered Figure 6B lanes 2 and 3. Accordingly, no new 
matter is added by these amendments. 

New claims 83-100 have been added. Support for new claims 83-100 is found in the claims as 
originally filed, and throughout the specification, in particular at the following exemplary locations: 
Claims 83-91: Specification pages 8, 16, 18, 19, 20, and newly renumbered Figure 6B lane 1; Claims 
92-100: Specification pages 8, 15, 17, and 21. Accordingly, no new matter is added by these new 
claims. 

The disclosure has been amended in the specification to address objections noted in the Office 
Action. The specification has been amended on pages 7 and 10. Support for the amended material can 
be found in the claims as originally filed, and throughout the specification, in particular at the following 
exemplary locations: page 7: original claim 1; page 10: original claim 17. 

The drawings have been amended to address objections noted in the Office Action. Original 
Figure 5B and original Figure 6B have been deleted. Original Figures 3C, 3D, 5C and 6C have been 
amended with revised figure numbers with replacement drawing sheets submitted herewith. The 
specification has also been amended to remove reference to the deleted figures. 

Applicants respectfully request reconsideration of the application in view of the remarks made herein. 

The undersigned Applicants' representative thanks Examiner Duffy for the courtesy of an in- 
person interview which took place on November 24, 2003, and which was attended by Examiner Duffy 
and Applicants' representatives Paula A. Borden and Edward J. Baba. 

During the interview, the rejections of claims 42-49 and 53-57 under 35 U.S.C. §112, §102, and 
§103 were discussed. The amendments to the claims reflect the discussions that took place during the 
interview. 

Drawings 

The drawings filed on 08/02/1999 have been objected to. Replacement figures have been 
submitted herewith, which replacement figures comply with the requirements for formal drawings. 
Withdrawal of these objections is respectfully requested. 

Original Figures 3C and 3D have been amended to comply with the rules. Original Figure 3C 
has been amended to revise the figure numbers to Figures 3C-3Y. Accordingly, original Figure 3D has 
been amended to Figure 3Y, to reflect the renumbering of original Figure 3C. 

Original Figures 5B and 6B have been deleted. Original Figures 5C and 6C have been amended 
to revise the figure numbers to reflect the deletion of Figures 5B and 6B. Replacement drawing sheets 
are provided herewith. The specification has also been amended to remove reference to the deleted 
figures. 

Specification Objections 

The disclosure was objected to because the text references claim numbers. The specification has 
been amended to remove the references to claim numbers and instead incorporate the language 
from the referenced claims into the text of the disclosure. The inserted material corresponds exactly to 
the text of the original claims that were referenced. Accordingly, no new matter has been added. 
Therefore, the Examiner is respectfully requested to withdraw the objection. 

Claim Objections 

Claims 47, 48, and 49 were objected to under 37 C.F.R. 1.75(c) as allegedly being in improper 
dependent form for failing to further limit the subject matter of a previous claim. 

Without conceding as to the correctness of this objection, claims 47 and 48 have been canceled, 
rendering the objection to these claims moot. In addition, claim 49 has been amended to remove 
reference to specific fragments of gp190/MSP1. Therefore, the Examiner is respectfully requested to 
withdraw this objection. 

Rejection under 35 U.S.C. §112, first paragraph 
New Matter 

Claims 42-49 and 53-57 were rejected under 35 U.S.C. §112, first paragraph, as allegedly 
containing subject matter which was not described in the specification in such a way as to reasonably 
convey to one skilled in the relevant art that the inventors, at the time the application was filed, had 
possession of the claimed invention. 

Specifically, the Office Action stated that the claims read on a reduction of any AT content in 
any MSP1 nucleotide sequence as compared to any other naturally occurring sequence. As suggested by 
the Examiner, independent claim 42 has been amended to recite "corresponding naturally occurring" in 
order to provide a reference point for comparison of a reduced AT content. 

The Office Action also states that claims 47 and 48 stand rejected because the claims are drawn 
to a method of producing a complete gp190/MSP1 polypeptide wherein the nucleotide sequence further 
comprises an attachment signal or further comprises a signal peptide. Without conceding as to the 
correctness of this rejection, claims 47 and 48 have been canceled and new claims 83 to 100 have been 
added in their place. New claims 83 to 100 are directed to two other variations of the gp190/MSP1 
protein where either the gp190/MSP1 lacks an attachment signal (claims 83-91) or it lacks both an 
attachment signal and signal peptide (claims 92-100). Support for new claims 83-100 can be found in 
the claims as originally filed, and throughout the specification, in particular at the following exemplary 
locations: Claims 83-91: Specification pages 8, 16, 18, 19, 20, and Figure 6C lane 1 (renumbered as 
Figure 6B); Claims 92-100: Specification pages 8, 15, 17, and 21. A schematic representation of 
examples of polypeptides recited in the claims is provided in the attached Exhibit 1, which was also 
made available to the Examiner prior to the telephone interview on November 24, 2003. Accordingly, no 
new matter is added by these new claims. 

Applicants submit that the new matter rejection of claims 42-49 and 53-57 under 35 U.S.C. 
§112, first paragraph, has been adequately addressed in view of the remarks set forth above. The 
Examiner is thus respectfully requested to withdraw the rejection. 

Written Description 

Claims 42-49 and 53-57 were rejected under 35 U.S.C. §112, first paragraph, as allegedly 
lacking written description in the specification. Specifically, the Examiner states that the specification 
"fails to describe the complete nucleotide sequences encoding naturally occurring gp190/MSP1 proteins 
corresponding to a representative number of these species, sufficient to describe the genus of nucleotides 
sequences that are modified to produce nucleotide sequences that are 'reduced' in their adenine-thymine 
content." In view of the remarks made below, Applicants respectfully traverse this rejection. 

Under MPEP § 2163.02, the standard for determining compliance with the Written Description 
requirement is whether the "specification conveys with reasonable clarity to those skilled in the art that, 
as of the filing date sought, applicant was in possession of the invention as now claimed." See, e.g., 
Vas-Cath Inc. v. Mahurkar, 935 F.2d 1555, 1563-64, 19 USPQ2d 1111, 1117 (Fed. Cir. 1991). 
Essentially, the specification must "clearly allow persons of ordinary skill in the art to recognize that 
they invented what is claimed." See Vas-Cath, 935 F.2d at 1116. 

In rejecting the claims the Office Action states that a representative number of nucleic acid 
sequences have not been provided for other Plasmodium species, therefore a skilled artisan could not 
"envision the detailed chemical structure of the encompassed nucleotide sequences that are used to 
produce the undescribed proteins of other at least 100 Plasmodium species and therefore conception is 
not achieved until reduction to practice has occurred. . .the nucleic acid itself is required." During the 
November 24, 2003 telephone interview, the Examiner indicated that if the nucleic acid sequence for 
MSP-1 of a representative number of species of Plasmodium were known at the time the instant 
application was filed, such would be sufficient to overcome the written description rejection. 

The Applicants submit that the nucleic acid sequences for the MSP-1 protein of a representative 
number of species of Plasmodium were known at the time the present application was filed. For 
example, Chang et al., Exp. Parasitol. 67(1):1-11 (1988) (Exhibit 2) discloses the nucleic acid encoding 
MSP-1 of Plasmodium falciparum (Uganda-Palo Alto strain); Lewis et al., Mol. Biochem. Parasitol. 
36(3):271-282 (1989) (Exhibit 3) discloses the nucleic acid encoding MSP-1 of Plasmodium yoelii; 
Deleersnijder et al., Mol. Biochem. Parasitol. 43(2):231-244 (1990) (Exhibit 4) discloses the nucleic 
acid encoding MSP-1 of Plasmodium chabaudi; Del Portillo et al., Proc. Natl. Acad. Sci. 88:4030- 
4034 (1991) (Exhibit 5) discloses the nucleic acid encoding MSP-1 of Plasmodium vivax (Belem 
strain); and Gibson et al., Mol. Biochem. Parasitol. 50(2):325-333 (1992) (Exhibit 6) discloses the nucleic 
acid encoding MSP-1 of Plasmodium vivax (Sal-1 strain). The Applicants note that the term Merozoite 
Surface Protein 1 (MSP1 or MSP-1) is also referred to in the literature as: Merozoite Surface Antigen 1 
(MSA1 or MSA-1); Plasmodium Major Merozoite Surface Antigen (PMMSA); and Major Merozoite 
Surface Protein Precursor. 

Since the sequences of the MSP-1 genes of various species of Plasmodium were available, the 
methods disclosed in the present application could be readily applied by one skilled in the relevant art of 
molecular biology to these other MSP-1 genes to produce nucleotide sequences that are "reduced" in 
their adenine-thymine content. Accordingly, the Applicants submit that the written description rejection 
of claims 42-49 and 53-57 under 35 U.S.C. §112, first paragraph, has been adequately addressed in view 
of the remarks set forth above. The Examiner is thus respectfully requested to withdraw the rejection. 

Enablement 

Claim 53 was rejected under 35 U.S.C. §112, first paragraph, as allegedly containing subject 
matter which was not described in the specification in such a way as to enable one skilled in the art to 
which it pertains, or with which it is most nearly connected, to make and/or use the invention. 
Without conceding the correctness of this rejection, claim 53 has been canceled in the spirit of 
expediting prosecution. Accordingly, this rejection is rendered moot and the Examiner is thus 
respectfully requested to withdraw the rejection. 

Claims 42-49 and 53-57 were rejected under 35 U.S.C. §112, second paragraph, as allegedly 
indefinite. 

The Office Action stated that claim 42 presents a comparison between two sequences that are not 
structurally defined and therefore, the metes and bounds of the claim cannot be ascertained. 
Specifically, the Office Action notes that the term "complete" and the terms "naturally occurring 
nucleotide sequence" do not help to structurally define the two sequences. In the spirit of expediting 
prosecution, and without conceding as to the correctness of this rejection, claim 42 has been amended to 
remove the term "complete" and in its place add "having an approximate weight of 190 kD" in order to 
describe the gp190/MSP1 protein. In addition, the term "corresponding" has been added to define 
the naturally occurring sequence. These amendments were discussed during the November 24, 2003 
telephone interview. Support for the amendments of claim 42 can be found in the claims as originally 
filed, and throughout the specification, in particular at the following exemplary locations: pages 8, 15, 
19, and Figure 6C lanes 2 and 3 (renumbered as Figure 6B). The amendments of claim 42 have also 
been incorporated in newly presented independent claims 83 and 92. 

Applicants submit that the rejection of claims 42-49 and 53-57 under 35 U.S.C. §112, second 
paragraph, has been adequately addressed in view of the remarks set forth above. The Examiner is thus 
respectfully requested to withdraw the rejection. 

Rejection under 35 U.S.C. §102/103 

Claims 42-49 and 53-57 were rejected under 35 U.S.C. § 102(b) or §103 as allegedly 
unpatentable over Holder et al. ((1985) Nature 317:270-273; hereinafter "Holder"). 

The Office Action maintained the rejection from the Office Action dated November 20, 2001, 
and stated that Holder et al. teaches the production of specific fragments of the full length gp190/MSP1 
from P. falciparum. The Examiner notes that the definition of the term "complete" on page 6 of the 
specification is inclusive of shorter forms, and that the claim does not define the specific sequence for 
comparison of the naturally occurring sequence. 

As noted above, claim 42 has been amended to remove the term "complete" and in its place add 
"having an approximate weight of 190 kD" in order to describe the gp190/MSP1 protein, and the term 
"corresponding" has been added to define the naturally occurring sequence. 

Holder does not render claims 42-49 and 53-57 obvious, as there is no mention in Holder of a 
method for producing gp190/MSP1 having an approximate molecular weight of 190 kD, much less a 
method of producing gp190/MSP1, comprising expressing a nucleotide sequence encoding gp190/MSP1 
in a single expression vector. As stated in the specification, until the instant invention, there was not any 
successful cloning of the coding region for gp190/MSP1 having an approximate molecular weight of 
190 kD. Holder does not disclose a method for solving this problem, nor does Holder suggest any such 
method. Accordingly, Holder cannot render the instant method as claimed obvious. 

Accordingly, the Applicants submit that the rejection of claims 42-49 and 53-57 under 35 
U.S.C. §102(b) or §103 has been adequately addressed in view of the amendments to the claims and 
remarks set forth above. Therefore, the Examiner is respectfully requested to withdraw the rejection and 
allow the application to proceed to issue. 

III. CONCLUSION 



Applicants submit that all of the claims are in condition for allowance, which action is requested. 
If the Examiner finds that a telephone conference would expedite the prosecution of this application, the 
Examiner is invited to telephone the undersigned at the number provided. 

The Commissioner is hereby authorized to charge any underpayment of fees associated with this 
communication, including any necessary fees for extensions of time, or credit any overpayment to 
Deposit Account No. 50-0815, order number GRUE003. 



BOZICEVIC, FIELD & FRANCIS LLP 
200 Middlefield Road, Suite 200 
Menlo Park, CA 94025 
Telephone: (650) 327-3400 
Facsimile: (650)327-3231 






Respectfully submitted, 

BOZICEVIC, FIELD & FRANCIS LLP 



Date: 

Exhibit 1 

U.S. Patent Application No. 09/269,874 
Edward J. Baba (Reg. No. 52,581) 
(650) 833-7731 



U.S.App.No.: 09/269,874 
Group Art Unit: 1641 
Examiner: P.A. Duffy 

Title: Recombinant Process for Preparing A Complete Malaria Antigen, 

GP190/MSP1 



Protein: gp190/MSP1 - amino acids 1-1639 of SEQ ID NO:3 

Gene: gp190s 

Description: The complete protein with the attachment signal and signal peptide 

Support: Specification pages 8 and 15 



Protein: gp190/MSP1 - amino acids 1-1621 of SEQ ID NO:3 

Gene: gp190s1 

Description: The protein with the signal peptide but lacking an attachment signal 

Support: Specification pages 8 and 15 



Protein: gp190/MSP1 - amino acids 20-1621 of SEQ ID NO:3 

Gene: gp190s2 

Description: The protein lacking the attachment signal and signal peptide 

Support: Specification pages 8 and 15 

Introduction 

Plasmodium falciparum is the causative agent of the most serious form of human malaria. The 
surface of the malaria parasite undergoes drastic antigenic changes during its complex life cycle. 
The predominant surface antigen of the sporozoite produced during the sexual cycle of the parasite 
in the Anopheles mosquito host is the circumsporozoite protein (Nussenzweig and Nussenzweig 1985). 
After injection of the sporozoite into the bloodstream of the vertebrate host and its uptake into 
hepatocytes the circumsporozoite protein is lost (Danforth et al. 1978). Little is known about 
surface antigens of the parasite during the hepatic stage of asexual development. However, surface 
antigens of the erythrocytic stages of the parasite life cycle have been well studied. The major 
surface antigens of the Plasmodium falciparum merozoite, the erythrocytic invasive stage of the 
parasite, are derived from a precursor glycoprotein with a molecular weight of 185-195,000 (gp195) 
(Freeman and Holder 1983; Hall et al. 1983; Holder and Freeman 1984). The gp195 precursor protein 
is synthesized during the late erythrocytic stage of development and is proteolytically processed 
to lower molecular weight fragments (Freeman and Holder 1983; Hall et al. 1984a). Three of these 
fragments of 83,000, 42,000, and 19,000 Da have been detected on the merozoite surface (Holder and 
Freeman 1984). An approximately 80,000-Da processing fragment of the gp195 has been localized in 
the surface coat of the merozoite by immunoelectron microscopy (Heidrich et al. 1986). 

The genes encoding the gp195 protein of several parasite isolates have been cloned and sequenced, 
revealing sequence polymorphisms among different genes (Hall et al. 1984b; Holder et al. 1985; 
Mackay et al. 1985; Weber et al. 1986; Tanabe et al. 1987; Peterson et al. 1988). Gp195-related 
polypeptides are candidates for a blood stage human malaria vaccine. Results of three monkey 
vaccination experiments using gp195-derived immunogens showed partial (Hall et al. 1984b; Perrin 
et al. 1984) to complete protection (Siddiqui et al. 1987). In each experiment, challenge has been 
with the P. falciparum FUP isolate, but only in the latter experiment were the monkeys immunized 
with gp195 purified from FUP parasites. In order to develop a fully protective recombinant 
polypeptide or synthetic peptide vaccine based on gp195 and to evaluate the significance of 
antigenic polymorphism of this protein in protective immunity, it appears crucial to define the 
structure of the FUP gp195 gene. 

This study presents the DNA sequence of the FUP gp195 gene and compares its translated amino acid 
sequence to others that have been published. In addition, the amino acid sequence dimorphism of 
gp195 proteins is correlated to secondary structure as predicted by hydropathy analysis. 

Materials and Methods 

Parasites. The Uganda-Palo Alto (FUP) strain of Plasmodium falciparum was originally isolated from 
a patient who had contracted the infection in Uganda and was hospitalized at Stanford Medical 
Center, Palo Alto, California, in 1966. In 1967, blood-induced infections with this isolate were 
established in Aotus trivirgatus monkeys at Stanford (Geiman and Meagher 1967). The FUP strain was 
maintained by serial passage in Aotus monkeys by Dr. Schmidt of the Southern Research Institute, 
Birmingham, Alabama. In 1970, the FUP monkey-passaged strain was obtained from Dr. Schmidt and 
maintained in this laboratory at the University of Hawaii by serial passage in Aotus monkeys. In 
1977, continuous in vitro cultures of the FUP strain in human erythrocytes were established at the 
University of Hawaii and have been maintained in this laboratory since that time. The FUP parasites 
used in this study were derived from in vitro cultures. 

Isolation of P. falciparum DNA. DNA was isolated from cultured FUP strain P. falciparum using the 
Trager and Jensen (1976) culture technique with modifications (Siddiqui and Palmer 1981) and 
standard DNA extraction methods (Maniatis et al. 1982). The Protoclone bacteriophage λ gt10 system 
(Promega Biotec, Madison, WI, U.S.A.) was used to generate an FUP P. falciparum genomic library. 
FUP DNA (0.3 µg, 0.1 pmole) was digested with 3 units of the restriction endonuclease EcoRI 
(Boehringer Mannheim, Indianapolis, IN, U.S.A.) and ligated to 0.5 µg (0.17 pmole) EcoRI-digested 
λ gt10 DNA (Promega Biotec) with 1 unit T4 polynucleotide ligase (Promega Biotec). The resultant 
recombinant phage were grown in Escherichia coli strain C600AHFL cells, and generated a library of 
2.5 x 10^6 plaque-forming units in which 87% of the phage contained inserts. 

Preparation of synthetic oligonucleotides. Oligonucleotides used as hybridization probes were 
synthesized as pairs of 30-mers overlapping by 10 base pairs. Probes were synthesized using methoxy 
phosphoramidites or β-cyanoethyl phosphoramidites on Applied Biosystems DNA synthesizers (Foster 
City, CA, U.S.A.) according to the manufacturer's recommendations. The oligonucleotides were 
radiolabeled by fill-in reactions using high specific activity (3000 Ci/mmole) 
5'-[α-32P]deoxycytidine triphosphates and -deoxyadenosine triphosphates (Amersham, Arlington 
Heights, IL, U.S.A.) and Klenow DNA polymerase I (Boehringer Mannheim). Hybridization to phage DNA 
on nitrocellulose filters and washing were carried out as described by Ullrich et al. (1984). 
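
The probe layout described above (pairs of 30-mers overlapping by 10 base pairs, extended by a fill-in reaction) can be sketched in Python. This is an illustrative sketch only: the function names and the 50-nucleotide example target are hypothetical and are not taken from the paper, and the sketch assumes that each pair spans a 50-nucleotide stretch of conserved sequence and that the two oligonucleotides anneal through their 3' ends. 

```python
# Illustrative sketch only: function names and the example target are
# hypothetical and do not come from the paper.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def design_probe_pair(target: str, oligo_len: int = 30, overlap: int = 10):
    """Split a target into two oligos whose 3' ends anneal over `overlap` bases.

    The top oligo is the first `oligo_len` bases of the target strand; the
    bottom oligo is the reverse complement of the last `oligo_len` bases.
    """
    target = target.upper()
    expected = 2 * oligo_len - overlap
    if len(target) != expected:
        raise ValueError(f"target must be {expected} nt, got {len(target)}")
    top = target[:oligo_len]
    bottom = reverse_complement(target[-oligo_len:])
    return top, bottom


def fill_in(top: str, bottom: str, overlap: int = 10) -> str:
    """Return the top strand of the duplex made by extending both 3' ends."""
    if top[-overlap:] != reverse_complement(bottom[-overlap:]):
        raise ValueError("oligos do not anneal over the expected overlap")
    return top + reverse_complement(bottom)[overlap:]


if __name__ == "__main__":
    target = "ACGT" * 12 + "AC"  # made-up 50-nt sequence, not gp195 data
    top, bottom = design_probe_pair(target)
    print(top)
    print(bottom)
    print(fill_in(top, bottom) == target)
```

Under these assumptions each pair yields a 50-base-pair duplex in which 20 bases per strand are newly synthesized, which is where labeled nucleotides would be incorporated. 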

Restriction map analysis and DNA sequencing. λ gt10 recombinant phage inserts were subcloned into 
pUC plasmids for restriction mapping and into M13mp18 or M13mp19 phage for DNA sequencing. M13 
subclones were sequenced by the enzymatic method (Sanger et al. 1977) using a universal M13 primer 
(Pharmacia, Piscataway, NJ, U.S.A.) or specific primers complementary to the insert sequence. 
Sequencing primers were synthesized using β-cyanoethyl phosphoramidites as described above for 
oligonucleotide probes. DNA sequence data were analyzed using computer resources provided by the 
NIH-sponsored BIONET National Computer Resource for Molecular Biology. Protein sequences were 
additionally analyzed using the computer programs described by Pauletti et al. (1985) and Gotoh 
(1986). 

Results 

A λ gt10 library of FUP strain genomic DNA was screened with several synthetic oligonucleotide 
probes based on the conserved sequences in the gp195 genes of the Wellcome and K1 strains (Holder 
et al. 1985; Mackay et al. 1985). Three λ clones (W-l, 3-1, and 18-1) contained the entire coding 
region of the strain FUP gp195 gene (Fig. 1) and were mapped using restriction endonucleases and 
subcloned into M13 sequencing vectors. The M13 subclones of these three λ clones were sequenced. 

The complete DNA sequence and amino acid translation of the strain FUP gp195 gene is presented in 
Fig. 2. The calculated molecular weight of the entire protein is 196,245. There are 15 potential 
N-glycosylation sites (Snider 1984). The protein contains 20 cysteine residues; 19 cysteines are 
conserved among various isolates and 13 are located immediately before the hydrophobic carboxy 
terminal region. 



The amino acid translation of the FUP gp195 gene has been aligned with the sequences of other 
isolates (Fig. 3) using the computer algorithm described by Gotoh (1986). Based on this alignment 
the gp195 sequence can be divided into three types of regions. The first type is the conserved 
region (85-100% sequence identity). Conserved regions are located at the amino terminus and the 
carboxy terminus, as well as at internal segments of the protein. The second type of region is the 
variable repeat region which differs greatly among isolates in both sequence and length and is 
located toward the amino terminus. The third type of region is designated group-specific because it 
appears to exist in two forms which differ greatly in sequence (42-46% sequence identity). Over 
one-half of the gp195 protein consists of group-specific sequences. Others have previously 
recognized this pattern of polymorphism of the gp195 gene in the parasite population and termed it 
allelic dimorphism (Tanabe et al. 1987). The FUP protein belongs to the same dimorphic group as the 
gp195 of the Papua New Guinea isolate MAD20, differing primarily at the variable repeat region. 




[Figure caption fragment: ... map indicates the restriction endonuclease sites ...] 

Fig. 2. Nucleotide sequence of the Uganda-Palo Alto (FUP) gp195 gene. The deduced amino acid 
sequence of the open reading frame is shown below the nucleotide sequence. The potential signal 
peptidase site is indicated with an arrowhead at position 57. The variable repeat region is underlined 
beginning at position 190 and the hydrophobic carboxy terminal sequence is overlined beginning at 
position 5125. Potential N-glycosylation sites are denoted by solid circles. 

Several substitutions can also be noted at FUP amino acid positions 344-376 (a sequence which it 
shares with the Wellcome sequence), along with a 6-base-pair deletion relative to MAD20 at position 
782, and several single amino acid substitutions scattered throughout the gene. Nevertheless, the 
FUP and MAD20 gp195 proteins are very similar over most of their sequences, in contrast to the more 
distant sequence relationship of FUP and Wellcome gp195. The variable repeat region of the FUP 
gp195 protein is the same as that of the Malaysian CAMP isolate (Weber et al. 1986). In fact, the 
published partial sequence of the CAMP gp195 gene (Weber et al. 1986) is identical to the 
corresponding sequence of the FUP gene. 
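
The percent-identity figures used above to separate conserved regions (85-100%) from group-specific regions (42-46%) can be illustrated with a small Python sketch. The gap symbol, the window size, the function names, and the way gaps are counted are assumptions made for illustration; the paper does not state how identity was computed from the Gotoh alignment. 

```python
# Illustrative sketch only: the gap symbol, window size, and gap handling
# are assumptions, not the procedure used in the paper.
GAP = "-"


def percent_identity(a: str, b: str) -> float:
    """Percent identity of two aligned sequences of equal length.

    Columns in which both sequences carry a gap are ignored; a column with
    a gap in only one sequence counts as a mismatch.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(x, y) for x, y in zip(a, b) if not (x == GAP and y == GAP)]
    if not columns:
        raise ValueError("no comparable columns")
    matches = sum(1 for x, y in columns if x == y)
    return 100.0 * matches / len(columns)


def identity_profile(a: str, b: str, window: int = 50):
    """Percent identity in consecutive, non-overlapping alignment windows.

    Returns a list of (window start, percent identity) pairs.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    return [
        (start, percent_identity(a[start:start + window], b[start:start + window]))
        for start in range(0, len(a), window)
    ]


if __name__ == "__main__":
    # Toy aligned fragments, not real gp195 sequence.
    print(percent_identity("MKIIFFLC", "MKIIFFLS"))
    print(identity_profile("MKIIFFLC", "MKIVFALS", window=4))
```

Scanning an alignment window by window in this way gives a rough view of where identity is high and where it falls toward the group-specific range, although the exact boundaries depend on the window size chosen. 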

The extensive sequence differences of the group-specific regions of gp195 genes raise the 
possibility that these dimorphs may represent functionally divergent proteins. Hydropathy profiles, 
which reflect secondary structure as predicted by water interactions (Kyte and Doolittle 1982), 
have been used to demonstrate functional conservation among distantly related proteins (Simpson 
et al. 1987). We compared the hydropathy profiles of the FUP and Wellcome dimorphic gp195 proteins 
using an adaptation of the method of Kyte and Doolittle (Kyte and Doolittle 1982; Pauletti et al. 
1985) to determine whether these proteins would be predicted to differ in structure (Fig. 4). As 
anticipated, regions of conserved sequence displayed identical hydropathy patterns. The different 
repeat regions also differed in hydropathy. However, regions containing dimorphic or group-specific 
sequences maintained very similar hydropathy patterns with only subtle changes in degree of 
hydrophilicity or hydrophobicity. A few regions where water interactions differed were at positions 
900-1000 (FUP) and 1460-1640 (FUP), which correspond to insertions into the FUP gene that are 
lacking in the Wellcome gene. The overall conservation of the hydropathy pattern indicates that 
despite extensive amino acid differences the basic structure of the gp195 protein has been 
conserved among Plasmodium falciparum isolates. 
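
A hydropathy profile of the kind compared above can be sketched in Python as a sliding-window average of the Kyte and Doolittle (1982) index. This is the plain Kyte-Doolittle calculation, not the Pauletti et al. (1985) adaptation used in the study; the window size of 9 residues and the function names are assumptions chosen for illustration. 

```python
# Illustrative sketch only: plain Kyte-Doolittle averaging with an assumed
# window size; not the adapted method used in the study.

# Kyte-Doolittle (1982) hydropathy index for the 20 standard amino acids.
KD_INDEX = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq: str, window: int = 9):
    """Mean hydropathy index over every window of `window` residues."""
    seq = seq.upper()
    if window < 1 or window > len(seq):
        raise ValueError("window must be between 1 and the sequence length")
    values = [KD_INDEX[res] for res in seq]  # KeyError on non-standard residues
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


def mean_abs_difference(p, q) -> float:
    """Mean absolute difference between two profiles of equal length."""
    if len(p) != len(q):
        raise ValueError("profiles must have equal length")
    return sum(abs(x - y) for x, y in zip(p, q)) / len(p)


if __name__ == "__main__":
    # Toy peptides, not real gp195 sequence.
    a = hydropathy_profile("MKIIFFLCSFLFFIINTQ")
    b = hydropathy_profile("MKVLFFLSSFVFFIVNTE")
    print(round(mean_abs_difference(a, b), 3))
```

Positive window averages indicate hydrophobic stretches and negative averages hydrophilic ones, so two sequences that differ in residues but keep the same pattern of signs and peaks would be read as structurally similar in the sense used above. 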

Discussion 

The determination of the complete DNA sequence of the FUP isolate gp195 gene extends our 
understanding of the degree of polymorphism of the major merozoite surface coat protein of 
Plasmodium falciparum. Its similarity to the previously reported gp195 sequence of the Papua New 
Guinea isolate (Tanabe et al. 1987) indicates that outside of the variable repeat region the 
polymorphism of this antigen is not extreme. This supports the proposal of others that the parasite 
population may be represented by two allelic groups, or dimorphs, which can undergo recombination 
during the sexual cycle in the mosquito vector to produce hybrid proteins, such as that observed 
for the Thailand K1 isolate (Tanabe et al. 1987). The identity between the partial sequence of the 
Malaysian CAMP isolate gp195 gene (Weber et al. 1986) and the corresponding region of the FUP gene 
is surprising since these isolates were obtained from distinct geographical areas. Nearly identical 
DNA sequences (8 nucleotide differences) were reported for the gp195 genes of the MAD20 and FC27 
isolates (Tanabe et al. 1987; Peterson et al. 1988); however, these two isolates were both obtained 
from Papua New Guinea and thus may be derived from the same parasite population. The FUP and CAMP 
isolates were also found to be similar in sensitivity to several antimalarial drugs in vitro 
(Siddiqui et al. 1972) although they differed in drug sensitivity in vivo (Degowin and Powell, 
1965). The genetic relationship between parasites of these two isolates is being investigated 
further using molecular probes specific for several genetic loci of P. falciparum. 

Sequence comparisons between gp195 genes of different isolates indicate that amino acid sequences 
of the dimorphic proteins can differ by greater than 50% in certain parts of the gene. Given such 
extensive sequence dissimilarities, we examined whether structural differences as reflected by 
hydropathy profiles could be observed. A comparison of hydropathy profiles of the FUP and Wellcome 
gp195 proteins, which represent the dimorphic groups, indicated that outside of the variable repeat 
regions the hydropathy patterns of the two proteins were very similar. These results suggest that 
the overall structure of the gp195 proteins of different P. falciparum isolates is conserved. Such 
structural conservation suggests that the group-specific regions of gp195 proteins share the same 
function. However, this function may be more dependent on the overall conformation of the protein 
than on its primary sequence. Alternatively, it is possible that group-specific regions may be 
involved in a function which can be carried out using two different pathways, such as the 
interaction of the parasite with different receptors on the host cell, as has been suggested by 
others (Tanabe et al. 1987). 

In designing an effective malaria vaccine, 
it is essential that the level of antigenic 
polymorphism of candidate antigens be 
carefully assessed. It is generally recog- 
nized that highly variable, isolate-specific 
sequences such as the variable repeats of 
gp!95 are less attractive for vaccine devel- 
opment than conserved sequences. Several 
regions of the gpl95 genes are highly con- 
served among all of the sequences that have 
been studied. These conserved regions are 
located on either side of the variable re- 
peats at the amino terminal end of the pro- 
tein, at the carboxy terminal region, and 



between the two large group-specific re- 
gions (Fig. 3). These conserved regions 
would be located on two to three distinct 
processing fragments on the mature mero- 
zoite surface (Lyon et al 1986) and may be 
important in the merozoite invasion pro- 
cess. However, it remains to be shown 
whether epitopes contained in these regions 
are more relevant to immunity than those in 
other, less-conserved regions. Both con- 
served and group-specific regions of this 
protein have a generally hydrophilic char- 
acter and exposure of these regions on the 
protein surface would allow them to be rec- 
ognized as antigenic determinants (Hopp 
and Woods 1981). The extensive amino 
acid substitutions in group-specific regions 
of the gp195 protein make it likely that pro- 
teins of different groups would be antigen- 
ically distinct. While immunity developed 
against these regions would probably be 
group-specific, the evidence that there are a 
limited number (possibly only two forms) of 
group-specific regions makes it feasible to 
consider including both of these regions in a 
recombinant vaccine. 

Information on the primary structure of 
gp195 in the FUP strain enables us to re- 
evaluate vaccination experiments in which 
monkeys immunized with antigen from an- 
other strain of P. falciparum were chal- 
lenged with parasites of the FUP strain. 
While it must be recognized that these ex- 
periments differed in experimental detail, it 
is still informative to discuss them in light of 
this new information. Hall et al (1984b) im- 
munized Saimiri monkeys with monoclonal 
antibody-purified p190, the gp195- 
equivalent of the K1 (Thailand) strain of the 
parasite, and challenged these animals with 
the FUP strain. Two of three immunized 



Fig. 3. Comparison of the amino acid sequences encoded by the gp195 gene of the FUP-Uganda, 
MAD20-Papua New Guinea (Tanabe et al 1987), Wellcome-Lagos (Holder et al 1985), and K1- 
Thailand (Mackay et al 1985) strains of P. falciparum. Alignment was done using the Gotoh algorithm 
(Gotoh 1986). Shared sequences are indicated by blank spaces and gaps are indicated by periods. 
Conserved, variable repeat, and group-specific regions are indicated as the respectively designated, 
overlined sequences. 
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animals were partially protected from par- 
asite infection (5-10% parasitemias) while a 
third animal reacted similarly to the unim- 
munized controls and required drug treat- 
ment. Siddiqui et al (1978), using FVO 
(Vietnam) strain parasites for immuniza- 
tion, also obtained only partial protection 
of immunized Aotus monkeys challenged 
with FUP strain parasites. These results 
contrast with the complete protection ob- 
tained in Aotus monkeys immunized with 
purified FUP gp195 and challenged with 
parasites of the same strain (Siddiqui et al 
1987). The results of the strain K1/strain 
FUP heterologous vaccination experiment 
are particularly interesting because except 
for a 36-bp deletion in the repeat region, the 
first 375 amino acids of K1 and FUP gp195 
are 99.9% identical (differ by 16 base pairs, 
Fig. 3). The repeat region of the challenge 
FUP strain is made up of repeats which 
are also found in the K1 polypeptide 
[SAQ(SGT)n], the only differences being 
the larger number of consecutive SGT re- 
peats (n) and overall greater length of the 
FUP repeat region and the absence of a 
third type of repeat unit (SGP). However, 
beyond residue 375 of the K1 sequence, the 
sequence similarity between K1 and FUP 
proteins drops below 50%. While partial 
protection may have been achieved in this 
experiment by immunity to conserved 
epitopes located within the amino terminal 
region of the polypeptide, complete protec- 
tion may require immunity to noncon- 
served epitopes located beyond this region 
and/or to conformational epitopes ex- 
pressed by the longer repeat region of the 
FUP polypeptide. The involvement of non- 
conserved epitopes in immunity is sup- 
ported by the findings of Cheung et al 
(1986), in which Saimiri monkeys immu- 
nized with a conserved amino terminal pep- 
tide of gp195 were also incompletely pro- 
tected from challenge with the malaria par- 
asite. Most recently Patarroyo et al (1987) 
have shown that a synthetic peptide corre- 
sponding to a moderately conserved region 
at the amino terminal end of gp195 (resi- 
dues 43-53) contributed to the development 
of protective immunity but could not alone 
protect Aotus monkeys against malaria. 
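
The repeat-region comparison made above, SAQ(SGT)n with a larger n in FUP and no SGP unit, can be sketched as a simple tandem-repeat count. The two sequences below are short hypothetical stand-ins chosen only to show the logic; they are not the K1 or FUP sequences, and the function name is an assumption for this example.

```python
import re


def longest_tandem_run(seq, unit):
    """Return (start, copies) for the longest run of consecutive copies of
    `unit` in `seq`, using a 0-based start; (-1, 0) if the unit is absent."""
    best = (-1, 0)
    pattern = "(?:" + re.escape(unit) + ")+"
    for match in re.finditer(pattern, seq):
        copies = (match.end() - match.start()) // len(unit)
        if copies > best[1]:
            best = (match.start(), copies)
    return best


# Hypothetical toy repeat regions, for illustration only.
k1_like = "SAQSGTSGTSGPSGT"      # contains an SGP unit, short SGT run
fup_like = "SAQSGTSGTSGTSGTSGT"  # no SGP unit, longer SGT run

for name, region in (("K1-like", k1_like), ("FUP-like", fup_like)):
    start, n = longest_tandem_run(region, "SGT")
    print(name, "longest SGT run:", n, "| contains SGP:", "SGP" in region)
```

Run on real repeat regions, the same two quantities (the value of n and the presence of a third unit type) are the differences the paragraph describes.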

The mounting evidence linking the gp195 
molecule to protection in the monkey 
model (Hall et al 1984b; Perrin et al 1984; 
Cheung et al 1986; Patarroyo et al 1987; 
Siddiqui et al 1987), along with the growing 
number of characterized gp195 genes of dif- 
ferent P. falciparum isolates, provides a 
powerful basis for development of a blood 
stage malaria vaccine. Information on the 
structural relatedness of the different gp195 
genes permits a rational design of vaccina- 
tion experiments which simulate the poly- 
morphism encountered in nature. It also al- 
lows us to evaluate the possible need for a 
multivalent gp195 vaccine to achieve clini- 
cal immunity in a susceptible population. 
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Introduction 

The precursor to the major merozoite surface 
antigens (PMMSA) has been proposed as a can- 
didate for a vaccine directed against the asexual 
erythrocyte stage of malaria [1]. This polypep- 
tide has been identified in human [2,3], simian [4] 
and rodent [5-7] malarial species ranging in mo- 
lecular mass from 185 to 250 kDa. The precursor 
is synthesised during intraerythrocytic develop- 
ment of the parasite, and is processed into a 
number of discrete fragments during merozoite 

[...] [2 ... 11]. PMMSA fragments have [...] mero- 
zoite [12-15] and a possible role for the protein 
[...] erythrocyte invasion [16]. 
The PMMSA has a size of 230 kDa in the [...] Plasmodium yoelii (Py230) [5]. 
Purified Py230, and a monoclonal antibody which recognises a C-terminal 
epitope on the protein, have both been shown to protect mice from challenge 
infection with P. yoelii [5,17,18]. A role for cell-mediated immunity has also 
been implicated [...] response observed to the whole antigen [19]. The 
analogous protein in the human malaria Plasmodium falciparum has a size in the 
range [...] synthetic peptides derived from this antigen [23,24], have been 
used to produce partial or complete protection against challenge infection in 
non-human vaccine trials. Furthermore, a polymeric synthetic hybrid protein, 
based upon a mixture of three synthetic peptides including a derivative of 
Pf195, has been found to induce protective immunity in humans [25]. These 
results reinforce the potential of the PMMSA as a candidate for a vaccine 
against the malarial asexual blood stage, and emphasize the need to develop an 
experimental model system for this antigen in order to analyse in more detail 
the mechanisms involved in protective immunity. 
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The 3' portion of the gene for the P. yoelii 
17XL 230-kDa protein was recently published 
[26]. This study describes the cloning and se- 
quencing of the complete Py230 gene from the 
YM strain of P. yoelii, and a comparison of the 
sequence with the P. falciparum analogue, Pf195. 

Materials and Methods 

Preparation of parasite genomic DNA. CD1 mice 
were infected with the YM strain of P. yoelii, and 
lymphocytes removed on day 1 post-infection by 
subcutaneous injection with 150 µl of 40 mg ml⁻¹ 
cyclophosphamide in 0.85% NaCl [27]. Parasi- 
tised blood was collected on day 3 post-infection, 
when parasitaemias averaged 50%, and parasite 
DNA was prepared as previously described 
[28,29]. 

Construction of genomic DNA libraries. Three li- 
braries were constructed from the P. yoelii gen- 
omic DNA. All restriction endonuclease digests 
were performed under conditions recommended by the 
manufacturer. Library (1): parasite DNA was di- 
gested with mung bean nuclease (Pharmacia) in 
35% formamide as described [30], and ligated to 
phosphorylated EcoRI linkers (Pharmacia). After 
digestion with EcoRI (Amersham), excess link- 
ers were removed using a NACS PREPAC col- 
umn (BRL). The DNA was ligated into the 
EcoRI site of λgt11 (Stratagene), and a genomic 
library constructed by in vitro packaging (Stra- 
tagene). Library (2): parasite DNA was digested 
with DraI (NBL) and ligated into the SmaI site 
of pUC9 (Pharmacia) treated with calf intestinal 
alkaline phosphatase [28]. A genomic library was 
constructed by transformation of Max Efficiency 
DH5α competent cells (BRL). Library (3): para- 
site DNA was digested with EcoRI and a gen- 
omic library constructed, as above, using EcoRI- 
cut pUC9. 

Screening of genomic libraries. Libraries were screened using synthetic 
oligonucleotides made on a Biosearch Sam One DNA synthesiser (New Brunswick). 
Library (1) was screened using probes A (a 26-mer, 
5'-GAAGGTAATACATGTGTAGAAAATAA-3', corresponding to nucleotides 1807-1832 in 
ref. 26) and B (a 26-mer, 5'-TTTCTTTAACAAGAGAAGAGAAGCTG-3', corresponding to 
nucleotides 285-310 in ref. 26); library (2) was screened using probe B, and 
library (3) using probe C (an 18-mer, 5'-AAACAAAGATGCTTTAAG-3', corresponding 
to nucleotides 2685-2702 in Fig. 2). Bacteriophage plaques and bacterial 
colonies were lifted on to Hybond-N nylon filters (Amersham) following the 
manufacturer's instructions, and screened with 32P-labelled oligonucleotides 
[31] as previously described [32]. Bacteriophage λ and pUC9 plasmid DNA was 
isolated from positive clones as described [28,33]. 
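
A quick consistency check on the three probes can be sketched as follows: each stated length should equal the span of its stated coordinates, and a rough melting temperature can be estimated from base composition. The Wallace rule used here is a coarse approximation intended for short oligonucleotides and is an assumption of this example, not a method named in the text. The probe C sequence is entered with T at position 14, on the assumption that the non-nucleotide character printed there in the scanned text is a misread T.

```python
# Probe sequences and coordinates as given in the text
# (probe C position 14 read as T; see the note above).
PROBES = {
    "A": ("GAAGGTAATACATGTGTAGAAAATAA", 1807, 1832),
    "B": ("TTTCTTTAACAAGAGAAGAGAAGCTG", 285, 310),
    "C": ("AAACAAAGATGCTTTAAG", 2685, 2702),
}


def span_length(first, last):
    """Number of nucleotides covered by an inclusive coordinate range."""
    return last - first + 1


def wallace_tm(seq):
    """Rough Tm in degrees C by the Wallace rule: 2*(A+T) + 4*(G+C)."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    if at + gc != len(seq):
        raise ValueError("probe contains a character other than A, C, G or T")
    return 2 * at + 4 * gc


def check_probes(probes):
    """Map each probe name to whether its length matches its coordinates."""
    return {
        name: len(seq) == span_length(first, last)
        for name, (seq, first, last) in probes.items()
    }


for name, (seq, first, last) in PROBES.items():
    print(name, len(seq), span_length(first, last), wallace_tm(seq))
```

The `ValueError` branch is what would flag a transcription error such as a stray non-nucleotide letter in a probe sequence.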

DNA sequencing. The insert of the λgt11 recombinant was subcloned into the 
EcoRI site of pUC9. Regions of the genomic DNA pUC9 clones were sequenced in 
both directions by plasmid priming following the dideoxy chain termination 
method [34], according to the Sequenase kit (USB) protocol. Sequences were 
analysed on the Wellcome Biotech computer system with the aid of the programs 
IALIGN (National Biomedical Research Foundation) and DIAGON [35]. 

Results 

Isolation of P. yoelii YM Py230 clones. The 3' portion of the gene for the 
Py230 antigen from P. yoelii 17XL was published recently [26]. Two 
oligonucleotide probes, A and B, were synthesised, corresponding to 
nucleotides 1807-1832 and 285-310, respectively, from the published sequence. 
These regions were chosen as they possessed high nucleic acid homology to the 
corresponding sections of the Wellcome Pf195 sequence [36]. Approximately 
7 × 10^[...] phage from library (1) were screened with probe A, and 10 
positive clones were detected. After additional rounds of screening with 
probes A and B, one recombinant, λPyM4.3, remained positive. λPyM4.3 
contained an EcoRI insert of 4.3 kb (Fig. 1). 

Probe B was used to screen approximately 4 × 10⁴ recombinants of library (2). 
One positive clone was isolated, pPyD1.7, which was found to possess a DraI 
insert of 1.7 kb (Fig. 1). From sequence analysis, oligonucleotide probe C was 
synthesised and used to screen library (3). Approximately 2 × 10⁴ 
recombinants were probed, and 20 positives isolated. A genomic DNA clone, 
pPyE5.9, that was purified, contained a 5.9-kb EcoRI insert (Fig. 1). 

A fragment from λPyM4.3 was used to probe 
an RNA blot of total P. yoelii YM RNA. A 7.3- 
kb RNA species was detected (data not shown); 
the size expected for a transcript encoding a 230- 
kDa protein (including 5' leader and 3' non-cod- 
ing sequences). 

Nucleotide sequence of the P. yoelii YM Py230 gene. The λPyM4.3 insert was 
subcloned into EcoRI-cut pUC9. The DNA sequence was determined for a region 
overlapping the inserts from the three recombinant clones, spanning 5775 
nucleotides (Fig. 2). A methionine start codon at nucleotide 190 is followed 
by a single open reading frame of 5316 bp terminating with the first stop 
codon at nucleotide 5505. The A+T content is high, with an average of 69% 
within the coding region and 85% for the 5' and 3' untranslated sequences. 
This is consistent with levels found for the entire P. yoelii genome [37]. The 
open reading frame encodes a polypeptide of 1772 amino acid residues with a 
calculated size of 197 kDa, thus smaller than the 230 kDa determined using 
sodium dodecyl sulphate-polyacrylamide gel electrophoresis. A number of other 
malarial antigens [36,38-41], however, also exhibit such discrepancies, a 
property thought to be related to the repetitive regions of these 
polypeptides. The Py230 amino acid sequence contains six tandem repeats of the 
tetrapeptide Gly-Ala-Val-Pro, which may account for a similar discrepancy in 
this antigen. At the N-terminus of the polypeptide is a putative signal 
peptide of 19 amino acids, and at the C-terminus a potential 18-amino-acid 
hydrophobic membrane anchor. The sequence contains 20 cysteine residues, of 
which 10 are situated within the C-terminal 110 amino acids. Of the remaining 
10 cysteines, 8 appear to be positioned as 4 pairs, based upon their linear 
proximities. There are also 11 potential N-glycosylation sites (Asn-X-Ser/Thr, 
where X can be any amino acid with the probable exclusion of proline [42]) 
scattered throughout the molecule. The sequence of the C-terminal 2310 
nucleotides is identical to that published for the Py230 gene from P. yoelii 
17XL [26]. The virulent YM and 17XL strains were originally derived from a 
common ancestor, the uncloned avirulent isolate, 17X [43,44]. 
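
The Asn-X-Ser/Thr rule quoted above lends itself to a short pattern scan. The sketch below is one possible implementation, assuming single-letter amino acid codes and 1-based reporting of the Asn position; the function name and the test peptide are hypothetical and serve only to show that overlapping sites are found and that proline at X is excluded.

```python
import re

# Asn, then any residue except Pro, then Ser or Thr.
# The lookahead makes overlapping candidate sites visible to finditer.
SEQUON = re.compile(r"(?=N[^P][ST])")


def find_sequons(seq):
    """Return 1-based positions of Asn residues in potential N-glycosylation
    sites (Asn-X-Ser/Thr with X not Pro)."""
    return [m.start() + 1 for m in SEQUON.finditer(seq.upper())]


# Hypothetical peptide: NAS is a site, NPT is excluded, NNT and NTS overlap.
print(find_sequons("MNASNPTQNNTS"))
```

Such a scan identifies only potential sites; as the comparison with Pf195 in the following section makes clear, the presence of the motif does not by itself show that a site is used.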

Comparisons between the Py230 and Pf195 sequences. The P. yoelii YM Py230 
amino acid sequence was aligned with the Wellcome strain Pf195 sequence [36] 
by computer analysis (Fig. 3; a revised Wellcome Pf195 sequence was used which 
has been submitted to the GenBank and EMBL databases). An overall homology of 
31% was determined, with particular regions exhibiting as much as 60% 
conservation. Fourteen of the 20 cysteines within the Py230 sequence are 
located at positions similar to those in Pf195, including all 10 cysteines at 
the C-terminus. None of the N-glycosylation sites present in either 
polypeptide, however, are conserved. This may reflect the observation that the 
Py230 antigen is lacking in glycosylation [8], and that the glycosylation 
identified in the Pf195 may be confined to a glycolipid anchor identified at 
the C-terminus [45,46]. The Py230 sequence, when compared to Pf195, possesses 
two large 'inserted' blocks of amino acids in the central region and near to 
the N-terminus of the polypeptide. There is also a large deletion of residues 
following the putative signal peptide, which spans the tripeptide repeats 
(Ser-X-X) present in the Wellcome Pf195. 

Fig. 3. Comparison of the amino acid sequence of Py230 with Pf195 from the 
Wellcome strain [36]. Alignment was carried out using the IALIGN program 
(National Biomedical Research Foundation). Positions of cysteine residues that 
are conserved between sequences are indicated by filled circles, and those 
that are not conserved by open circles. Positions of potential N-glycosylation 
sites are denoted by open diamonds. The positions of previously determined 
Pf195 blocks based upon conservation of amino acids between different Pf195 
alleles [47] are shown. Conserved blocks are boxed by unbroken lines, 
semi-conserved blocks by broken lines, and variable blocks remain unboxed. 

The Py230 amino acid sequence was compared to the two Pf195 allelic variants, 
from the Wellcome and MAD20 [47] strains of P. falciparum, using the DIAGON 
computer program of Staden [35] (Fig. 4). Regions of conservation between 
Py230 and the Wellcome Pf195 allele were found to be similarly conserved when 
the MAD20 Pf195 allele was compared. The Py230 and Wellcome Pf195 sequences 
have also been compared to the published sequence for a portion of the 200-kDa 
PMMSA from the human malaria Plasmodium vivax [3] (data not shown). This 
aligns with the region 117-794 amino acid residues of Py230. It was found that 
conservation of amino acids between the PMMSAs of any two of the species 
closely mirrored the homologies observed for each of the other species 
comparisons. 

Fig. 4. Comparisons of the Py230, Wellcome Pf195 [36] and MAD20 Pf195 [47] 
amino acid sequences. Analysis was by the DIAGON program of Staden [35], using 
a proportional algorithm (proportional score 132, span length 11). The axes of 
the plots represent the appropriate amino acid sequences numbered from their 
N- to C-termini. 
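
An overall homology figure such as the 31% quoted above can be obtained from a pairwise alignment by counting identical residues. The sketch below shows one common convention, in which columns containing a gap in either sequence are left out of the denominator; the text does not state which convention the alignment program applied, so this choice, the function name and the toy alignment are assumptions of the example.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percentage of identical residues over the aligned, ungapped columns
    of two sequences taken from the same pairwise alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip columns with an insertion or deletion
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Hypothetical toy alignment: 6 ungapped columns, 5 of them identical.
print(round(percent_identity("MKC-LLSA", "MRCGLL-A"), 1))
```

Applying the same function to sub-ranges of the alignment would give the regional values, such as the 60% conservation reported for particular regions.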

Discussion 

The data presented here describe the cloning and sequencing of the complete 
gene for the 230-kDa merozoite antigen from P. yoelii YM. The protein 
possesses a structure closely resembling that of the 195-kDa PMMSA of 
P. falciparum, with similar putative signal peptide and membrane anchor 
sequences, and extensive amino acid homology throughout the length of the 
molecule. The level of similarity consolidates the use of the Py230 as an 
experimental model system for the PMMSA. 

Certain protein sequences are conserved between the PMMSAs of the three 
malarial species P. yoelii, P. falciparum and P. vivax. Malaria parasites are 
thought to fall into three evolutionary groups, based upon genomic DNA base 
composition and sequence similarities between malaria genes [48,49], and each 
of the above species is situated in a separate group. The conservation of 
amino acids observed between the PMMSAs of the three evolutionarily distant 
species thus argues that there are certain constraints placed upon these 
regions of the polypeptide. 

The P. falciparum PMMSA sequence has previously been divided into 17 blocks 
based upon conservation of amino acids between different Pf195 alleles [47]. 
These regions have been classified as either conserved (more than 87% 
homology), semi-conserved (areas of patchy homology) or variable (extensive 
divergence). As shown by the alignment in Fig. 3, however, these divisions may 
not necessarily reflect the actual conservation of amino acid sequences 
between malarial species. Certain 'variable' blocks from the Pf195 allelic 
analysis, such as amino acids 385-608 from [...], can be seen to contain 
regions of close homology when Py230/Pf195 comparisons are made. By contrast, 
'conserved' blocks can possess areas of comparatively little homology. DIAGON 
analysis indicates that any homology observed between Py230 and Pf195 is [...] 
reproduced in comparisons between the different Pf195 alleles from the 
Wellcome and MAD20 strains, even within the 'variable' blocks (Fig. 4). This 
result suggests that the interspecies [...] the PMMSA regions that are 
associated with essential structural and/or functional roles. 

The [...] amino acid sequence can be divided into 22 different blocks based 
upon interspecies conservation (Fig. 5). The blocks are classified thus: (a) 
conserved (possessing greater than 45% homology); (b) semi-conserved (between 
20 and 45% homology); and (c) variable (less than 20% homology and frequently 
containing large deletions or insertions of amino acids). All of the conserved 
cysteine residues are found within conserved blocks, thus suggesting important 
structural functions for these amino acids. The semi-conserved blocks often 
contain sequences of low homology interspersed with small regions of high 
conservation, and the latter sequences again probably signify amino acids of 
physical importance to the protein. The conserved and semi-conserved blocks 
are essentially α-helical in structure with the variable regions consisting of 
randomly coiled hydrophilic amino acids (data not shown). Such data suggest 
that the conserved blocks represent sequences internal to the protein, with 
the variable regions positioned on the extremities at the apices of adjacent 
α-helices. This would allow for the sequence variability observed. Exceptions 
to the rule are the repeat regions, Gly-Ala-Val-Pro and Ser-X-X, present in 
variable blocks of Py230 and Pf195, respectively. These are hydrophobic in 
each case, suggesting that they may be positioned on the interior of the 
protein. It has been proposed that tandem repeats in malaria antigens are 
likely to possess some major role, perhaps playing a critical part in the 
specific functioning of the antigen, or acting as an immunological decoy 
directing immune responses away from more functionally important regions of 
the polypeptide [49,50]. The assumed deep-seated positions of these 
structures, however, combined with the fact that repeats present in the PMMSA 
of one species are lacking in that of the other species, argue against both of 
these suggested roles. Indeed, a Pf195 antigen has been described that is 
totally lacking in repeats [51]. Whether the tandem repeats of the PMMSA 
possess an important function thus remains unclear. 
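
The three-way block classification described above (greater than 45%, between 20 and 45%, less than 20% homology) can be written as a small decision function. The sketch below assumes that the boundary values of exactly 20% and 45% fall in the semi-conserved class, which follows the wording of the text but is not stated explicitly; the function name is hypothetical.

```python
def classify_block(percent_homology):
    """Classify an interspecies block by its percentage homology.

    Thresholds follow the text: conserved (> 45%), semi-conserved
    (20 to 45%), variable (< 20%). Boundary handling is an assumption.
    """
    if not 0.0 <= percent_homology <= 100.0:
        raise ValueError("percent homology must lie between 0 and 100")
    if percent_homology > 45.0:
        return "conserved"
    if percent_homology >= 20.0:
        return "semi-conserved"
    return "variable"


for value in (60.0, 31.0, 12.5):
    print(value, classify_block(value))
```

Because the semi-conserved class is defined by an average over the block, it can still contain short stretches of high conservation, as the paragraph notes.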

The external positioning of the variable re- 
gions within the PMMSA makes them potentially 
highly immunogenic. The very variability of such 
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Introduction 

In the course of its intraerythrocytic development the Plasmodium parasite 
undergoes several rounds of nuclear division and forms a number of individual 
merozoites. After rupture of the infected red blood cell, merozoites are 
released into the bloodstream and rapidly reinvade new erythrocytes. Since 
merozoites represent the only stage in the erythrocytic cycle that is directly 
exposed to the immune system, they are considered to be important targets for 
vaccination [1]. 
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The major surface antigens of the merozoite are 
derived from a high-molecular-weight precursor glycoprotein that is often 
referred to as PMMSA (precursor to the major merozoite surface antigens). This 
antigen is synthesized late in the erythrocyte cycle and is subsequently 
processed into a number of smaller protein fragments that are associated with 
the surface of the mature merozoite [...]. The size of the precursor, as 
determined by SDS-polyacrylamide gel electrophoresis (SDS-PAGE), varies from 
185-205 kDa in Plasmodium falciparum [8] to 230 kDa in Plasmodium yoelii [...] 
Plasmodium chabaudi chabaudi (P. c. chabaudi) [10,11]. PMMSA shows 
considerable size and antigenic polymorphism between different isolates of 
P. falciparum [12] and P. c. chabaudi [13]. The PMMSA gene has been cloned and 
sequenced for a number of different isolates of P. falciparum [...]. 
Comparison of these sequences shows that the PMMSA gene consists of blocks 
that are highly conserved and blocks [...] [16]. Within each variable block 
only two distinct sequences have been 
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found with the exception of the variable block 
closest to the amino-terminus where three ver- 
sions have been found [22]. Furthermore most 
sequences contain near the aminoterminal end a 
region of tripeptide tandem repeats that is highly 
polymorphic between different strains. 

Partial sequence information is also available 
on the PMMSA gene of Plasmodium vivax [23] 
and the complete sequence of the P. y. yoelii YM 
PMMSA gene has recently been published [24]. 

Although several research groups have been 
able to induce partial or complete protection with 
purified PMMSA [25-28] or synthetic oligopep- 
tides derived from it [29-31], the exact mecha- 
nism by which this protective immunity operates 
is still unclear and probably involves both hu- 
moral and cell-mediated immunity [32-35]. As a 
first step towards developing a mouse model sys- 
tem in which these questions might more easily 
be addressed we have cloned and sequenced the 
PMMSA gene of the rodent malaria parasite P. 
c. chabaudi IP-PC1 and also established a crude 
epitope map of PMMSA. 

Materials and Methods 

Parasites. Strain IP-PC1 of P. c. chabaudi [36], obtained from Dr. P. Falanga 
(Institut Pasteur, Paris), was cloned by limiting dilution. One clone, termed 
IP-PC1/C, was used for the sequence analysis described in this study. Strain 
IP-PC1 is a rat-adapted strain that was transferred to mice, where it induces 
fairly synchronous infections. IP-PC1 schizont-infected erythrocytes do not 
sequester. Several attempts to mosquito-transmit IP-PC1 or IP-PC1/C were 
unsuccessful. Parasites were grown in OF1 outbred mice (Iffa Credo), kept in 
an inverted nycthemeral cycle for diurnal schizogony. 

Monoclonal antibodies. Hybridomas secreting PMMSA-specific monoclonal 
antibodies (mAbs) 1-7, 9-10, 50 and 52 were generated in this laboratory. 
Spleen cells from hyperimmune BALB/c mice were fused with myeloma cell-line 
NSO/U [37]. Hybridoma cultures producing antiplasmodial antibodies were 
identified by indirect immunofluorescence (IIF) and cloned by limiting 
dilution. Ascites fluid from pristane (2,6,10,14-tetramethylpentadecane; 
Aldrich)-primed mice was used as the source of mAbs. PMMSA-specific mAbs were 
identified on the basis of their surface reactivity with purified merozoites 
in suspension (IIF) and immunoprecipitation of an approximately 250-kDa 
antigen. Mice were made hyperimmune by nivaquine treatment at a parasitemia of 
5-25% followed by two more parasite challenges (10⁷ infected red cells per 
mouse) at three-week intervals. 

PMMSA-specific mAbs 12.3, 12.11, 12.12, 12.15, 12.17, raised against a cloned 
P. c. chabaudi isolate (isolate CB) [38], were kindly provided by D. Walliker 
(University of Edinburgh, U.K.). PMMSA-specific mAbs H98 and HK'hi were a 
generous gift from M. Hommel (University of Liverpool, U.K.) and had been 
raised against clone PC-7 of the P. c. chabaudi isolate IP-PC1 [39]. 

Preparation of cDNA and genomic libraries. Parasitized blood was collected 
when infection reached 30-50% and parasites were predominantly at schizont 
stage. Leukocytes were removed from infected blood as described elsewhere 
[40]. RNA was extracted by homogenization of saponin-liberated schizonts in 
6 M guanidinium-HCl/0.1 M Na-acetate, pH 5.2, and centrifugation through a 
4.8 M CsCl/10 mM EDTA (pH [...]) cushion at 35 000 rev./min in a Beckman SW 41 
rotor for 16 h. Poly(A)+ RNA was selected by oligo(dT) cellulose 
chromatography and cDNA prepared according to the Amersham cDNA synthesis kit 
protocol. The cDNA was subsequently methylated with EcoRI methylase and 
ligated to phosphorylated EcoRI linkers with T4 DNA ligase. This mixture was 
then cleaved with EcoRI and fractionated on a Bio-Gel A-50m column (Biorad). 
Fractions containing cDNA molecules > 500 bp were ligated to dephosphorylated 
λgt11 EcoRI arms and packaged in vitro (Packagene, Promega). 

For the construction of the genomic li- 
brary, EcoRI/XbaI-cleaved genomic DNA was 
ligated into dephosphorylated lambda GEM-2 
EcoRI/XbaI arms (Promega) and packaged in 
vitro. 

Screening of the libraries. Screening of recombinant phages with monoclonal 
sera was done according to Huynh et al. [41]. For hybridization screenings DNA 
probes were radiolabeled using the 'Multiprime DNA Labelling System' 
(Amersham) and hybridized to plaque blots on Hybond-N (Amersham) according to 
standard protocols [42]. Final washes occurred under stringent conditions 
(10 min at 65°C in 15 mM NaCl/1.5 mM Na-citrate). 

Sequencing. Inserts of selected clones were subcloned into M13mp18 or pUC18 
sequencing vectors. Plasmid or M13 subclones were then generated that 
contained progressive unidirectional deletions of each insert by controlled 
exonuclease III digestion (Erase-a-Base, Promega). Sequencing was done on 
either single-stranded (M13) or double-stranded (pUC18) templates by the 
dideoxy chain-termination method of Sanger et al. [43] using modified T7 DNA 
polymerase (Sequenase, USB). Both strands were entirely sequenced in the 
coding and 3' untranslated areas and partially in the 5' untranslated region. 
Computer-assisted storage and analysis of sequence data was facilitated using 
the PC/GENE software package (Intelligenetics). 

Epitope mapping. Plaque-purified recombinant cDNA λgt11 phages 72, 100, 46 and 
452 expressing parts of P. c. chabaudi PMMSA were toothpicked from suspensions 
containing approximately 10⁷ plaque-forming units ml⁻¹ onto the top agarose 
layer (containing approximately 10⁹ Escherichia coli Y1090 [41] bacteria) of 
90-mm diameter LB agar plates (100 µg ampicillin ml⁻¹). Plates were then 
incubated at 42°C for 3.5 h, subsequently overlaid with a dry nitrocellulose 
filter disk, saturated previously with 10 mM isopropyl 
β-D-thiogalactopyranoside (IPTG) in water, and incubated for an additional 
3.5 h at 37°C. Nitrocellulose membranes were then saturated overnight with 
TBST buffer (10 mM Tris-HCl, pH 8.0/150 mM NaCl/0.05% Tween-20) containing 1% 
BSA and then incubated for 1 h with TBST containing mouse serum antibody or 
ascites fluid at a 1/200 dilution. Filters were washed several times with TBST 
and then incubated for [...] h with TBST containing affinity-purified alkaline 
phosphatase-conjugated goat anti-mouse IgG antibodies (dilution 1/1000). After 
a final round of washes, substrate (0.05 mg ml⁻¹ 5-bromo-6-chloro-3-indolyl 
acetate, 0.1 mg ml⁻¹ nitro blue tetrazolium in 10 mM MgCl2/100 mM NaCl/100 mM 
Tris-HCl, pH 9.5) was added to the filters and the enzymatic reaction was 
stopped after 5-15 min by rinsing the membranes with water. The relative 
position of cDNA expression clones 72, 100 and 46 in the open reading frame 
was determined by sequence analysis of the ends of their inserts. Clone 452 
was positioned by restriction mapping. 

Results 

Cloning strategy. cDNA expression libraries 
(λgt11) were screened with a mixture of PMMSA- 
specific mAbs. Several positive clones were de- 
tected among which clones 100 and 46 were 
selected for sequence analysis. Clone 46 was 
later shown to contain a cloning artefact. This 
involved the fortuitous ligation of a PMMSA- 
specific cDNA molecule to an unrelated cDNA 
molecule (dotted line in Fig. 1). This conclusion 
was based on Southern blot analysis and compar- 
ison with the P. y. yoelii YM PMMSA sequence 
and was confirmed by sequencing genomic clone 
RX4. Sequence data from the nonspecific part of 
clone 46 were discarded for this study. Clone R16 
was obtained by rescreening cDNA libraries with 
a radiolabeled DNA fragment that originated from 
the 3' end of the clone 100 insert. 

Clone 100 insert hybridized on Southern blots of EcoRI-cleaved genomic DNA to 
a single band of approximately 12 kb, whereas a single 5.4-kb band was 
detected on Southern blots of EcoRI/XbaI-cleaved genomic DNA (data not shown). 
This 5.4-kb band was cloned by screening a genomic library of EcoRI/XbaI-cut 
genomic DNA ligated into λ GEM-2 vector, with radiolabeled clone 100 insert. 
In this way genomic clone RX4 (insert size = 5.4 kb) was obtained. 

Fig. 1. Cloning strategy for the p199 gene. Clones 46, 100 and R16 are λgt11 
cDNA-expression clones. Clone RX4 is a genomic λ clone. The section of clone 
46 that originated from a fortuitous ligation event is represented by a dotted 
line. 

Sequence analysis. The inserts of overlapping clones RX4, 46, 100 and
R16 were subcloned into M13mp18 or pUC18 vectors and partially or
totally sequenced. Together these 4 inserts spanned a region of 6409
bp. This sequence is presented in Fig. 2. One major open reading
frame can be found, starting at nucleotide 667 and terminating with a
stop codon at position 6022. This codes for a protein of 1785 amino
acids (AA) with a calculated Mr of 198 886. The A+T content of the
sequence is high, with an average of 67% in the coding region and 83%
in the 5' and 3' untranslated regions. The A/T ratio of the coding
strand is 1.74. This biased A/T ratio of mRNA sense strands appears
to be a general phenomenon for malarial genes [44].
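
The two composition figures quoted above (A+T content and the A/T
ratio of the coding strand) can be illustrated with a minimal Python
sketch. The function name base_composition and the toy sequence are
assumptions made for illustration; they are not part of the original
analysis, which was run on the full 6409-bp sequence.

```python
from collections import Counter


def base_composition(seq: str) -> dict:
    """Return the A+T fraction and the A/T ratio of one DNA strand.

    Only the unambiguous bases A, C, G and T are counted.
    """
    counts = Counter(seq.upper())
    total = sum(counts[base] for base in "ACGT")
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T")
    at_fraction = (counts["A"] + counts["T"]) / total
    # The A/T ratio is strand specific: it compares A to T on the
    # strand supplied (here, the coding strand).
    a_over_t = counts["A"] / counts["T"] if counts["T"] else float("inf")
    return {"at_fraction": at_fraction, "a_over_t": a_over_t}


# Toy coding-strand fragment (hypothetical, not taken from p199).
toy = "AAAATTGC"
print(base_composition(toy))
```

Applied separately to the coding region and to the flanking
untranslated regions, a function of this kind would yield the two A+T
averages reported in the text.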

The encoded protein has many of the basic features of other PMMSA. At
the amino-terminal end a putative signal peptide (residues 1-19) is
present, while a stretch of hydrophobic amino acids, probably
functioning as a membrane anchor sequence, is found at the C-terminus
(residues 1765-1785). Ten out of a total of 20 Cys residues are
located in the last 110 AA of the protein. Eight potential
N-glycosylation sites are present. Several tandem repeat
oligopeptides, mostly incompletely conserved, are scattered
throughout the protein. Analysis at the nucleotide level shows that
the individual repeat units are clearly related. Most conspicuous is
a stretch of incompletely conserved 7 × 6 AA starting at residue 324.
Although the other tandem repeat structures are not so extensive,
they do occur in many different types. In addition to the
above-mentioned hexapeptides, tri-, tetra-, penta- and heptapeptide
tandem repeats can be observed. Also, a stretch of 7 consecutive
alanine residues and a string of 7 consecutive aspartic acid residues
are present in the sequence. These can be considered as monocodon
repeats.
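
Single-residue runs such as the 7 consecutive alanines or aspartic
acids mentioned above can be located with a short regular-expression
scan. The following Python sketch is illustrative only: the function
name homopolymer_runs, the minimum run length of 7 and the toy
peptide are assumptions, and the sketch does not detect the
incompletely conserved oligopeptide repeats.

```python
import re


def homopolymer_runs(protein: str, min_len: int = 7):
    """Return (residue, 1-based start, length) for each run of a
    single amino acid that is at least min_len residues long."""
    # (.)\1{n,} matches one residue followed by n or more copies of it.
    pattern = re.compile(r"(.)\1{%d,}" % (min_len - 1))
    return [
        (m.group(1), m.start() + 1, len(m.group(0)))
        for m in pattern.finditer(protein.upper())
    ]


# Toy peptide (hypothetical): a run of 7 Ala and a run of 7 Asp.
toy = "MK" + "A" * 7 + "T" + "D" * 7 + "Q"
print(homopolymer_runs(toy))
```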

Comparison to PMMSA of other species. The P. c. chabaudi PMMSA
sequence (p199) was aligned to known PMMSA sequences of other species
using the PALIGN program (PC/GENE). With P. y. yoelii YM PMMSA (p197)
an overall homology of 69% was detected at the AA level. This
homology is not equally distributed along the protein sequence but is
clustered in large zones of high homology interspersed with 4 areas
of very poor homology (Fig. 3). Large insertions and/or deletions
have occurred in these areas. Interestingly, all repetitious
sequences are found in these regions. The 20 Cys residues present in
both proteins are completely conserved. Alignment with either P.
falciparum PMMSA allelic sequence (isolates K1 and MAD20) displayed
33% overall homology. This homology is almost exclusively confined to
the conserved and semiconserved blocks, as defined by Tanabe [16]
(data not shown). The few patches of high homology that occur in the
variable blocks are very often also conserved between the 2 P.
falciparum alleles. The degree of homology appears to be as high with
the semiconserved as with the conserved blocks. From this alignment
it is also clear that the 4 divergent areas in the p199/p197
comparison coincide with the P. falciparum variable blocks 4, 8, 10
and 14 (as defined by Tanabe).

A crude epitope map of p199 was established by screening the
reactivity of a battery of 18 P. c. chabaudi PMMSA-specific
monoclonal antibodies with a set of overlapping cDNA expression
clones in λgt11. Fig. 4 summarizes the results. The major conclusion
from this analysis is that all monoclonals seem to map to the central
third part of the molecule and that none binds to the hexapeptide
tandem repeats. This area does not seem to be very immunogenic. These
data also indicate that carbohydrate moieties do not play a major
role in the immunogenicity of the molecule.

Discussion 

In this study we present the complete primary structure of P. c.
chabaudi PMMSA (p199) based on the DNA sequence of the corresponding
gene. p199 exhibits similar characteristics to other PMMSA. It has a
calculated Mr of 199 000, shows putative signal and membrane anchor
sequences and a clustering of Cys residues in the last 120 AA.
Although the predicted molecular weights for

Fig. 3. Alignment of p199 with P. y. yoelii PMMSA (p197). Residues
that are part of repetitive structures in both p199 and p197 are in
bold face. Positions of Cys residues that are conserved are indicated
by asterisks whilst conserved potential N-glycosylation sites are
denoted by + signs. The four highly divergent areas have been boxed.



of difference was observed between genomic clone 
RX4 and cDNA clones 100 and 46 in the areas 
where both types of clone have been sequenced 
(totalling about 1500 nucleotides). Taken together 
these data provide strong evidence that the p199
gene occurs as a single copy in the genome. 

The protein sequence of p199 is very homologous to p197 (69%) and 33%
homologous to both P. falciparum PMMSA allelic sequences. The same
level of homology (31%) had previously been observed between these 2
allelic sequences and p197 [24], indicating that, as might have been
expected, P. c. chabaudi and P. y. yoelii are evolutionarily
equidistant to P. falciparum. The major findings of the PMMSA
interspecific comparisons, as described here, are schematically
represented in Fig. 5 and can be summarized as follows: (i) p199 and
p197 are rather homologous but differ extensively in 4 areas that
correspond to P. falciparum PMMSA variable blocks 4, 8, 10 and 14. By
contrast, variable blocks 2, 6 and 16 are well conserved between p199
and p197. (ii) The homology that exists between p199/p197 and P.
falciparum PMMSA is situated almost exclusively in the conserved and
semiconserved blocks. The variable blocks are nearly totally
divergent. In these variable areas small patches of homology occur
that frequently correspond to stretches that are also conserved
between the 2 P. falciparum alleles.

It is assumed that the P. falciparum alleles evolved in 2
biologically isolated populations that later on merged [16]. The
rodent malaria sequence data further illustrate the evolutionary
behavior of PMMSA: some areas of the protein are well conserved
whereas others are very variable. It is interesting that whereas 7
hypervariable areas are apparent when comparing the 2 P. falciparum
alleles or when comparing P. falciparum PMMSA with the rodent malaria
PMMSA, P. c. chabaudi and P. y. yoelii only diverge profoundly in 4
of these 7 variable blocks. This might indicate that variable blocks
4, 8, 10 and 14 evolve even faster than variable blocks 2, 6 and 16.

The question might be asked as to why some regions in the molecule
evolved very rapidly while other parts changed at a slower pace.
Clearly there must be structural and/or functional constraints on the
more conserved parts of the protein. These data might however also
indicate that the divergent zones were subject to positive selection
which increased the rate of genetic change. Two types of positive
selection that might be envisaged are the need to adapt to an
evolving vertebrate host, and immunological pressure. Since PMMSA is
considered to participate in the recognition and/or invasion of red
cells [47], the variable blocks, which differ considerably between
different Plasmodium species and between the 2 P. falciparum alleles,
might constitute the domains that mediate this interaction. It must
be assumed then that the 2 P. falciparum PMMSA alleles evolved to
interact with different structures on the human red cell membrane (as
proposed by Tanabe et al. [16]) and also that P. c. chabaudi and P.
y. yoelii PMMSA recognize different groups on the mouse red cell. It
is however possible that the need to interact with different red cell
structures in different hosts can be accommodated by minor AA changes
in the more conserved areas.

Alternatively, the rapid genetic change in the divergent areas might
have been generated by immunological pressure. The repetitive motifs
which are found in these areas might be the result of special genetic
mechanisms that warrant rapid diversification for the evasion of
immune responses. The strain-specific protection shown by PMMSA of P.
c. chabaudi AS and CB [48] and the isolation of anti-PMMSA
mAb-resistant lines from cloned P. c. chabaudi AS [49] are in keeping
with this hypothesis. Interesting also in this respect is the
observation that P. c. chabaudi PMMSA appears to be even more
polymorphic than P. falciparum PMMSA, since every field isolate out
of 15 tested belonged to a different PMMSA serotype (McLean, A.P.,
Ph.D. Thesis, University of Edinburgh, 1986).

At the same time these variable regions are probably not very
immunodominant. This is indicated by our epitope mapping studies. The
first variable area (block 4) is not recognized by any of the
PMMSA-specific monoclonals, while the epitopes of 4 mAbs were shown
to map outside of the 4 divergent areas. On the basis of its similar
structure (alternating tripeptide repeats) the major repeat area in
p199 (residues 324-365) appears to be the homologue of the tripeptide
repeats seen in PMMSA of most P. falciparum isolates. However, the P.
falciparum tripeptide tandem repeats occur in variable block 2
whereas the p199 repeats are found in variable block 4. The
tripeptide tandem repeat area in P. falciparum PMMSA is widely
polymorphic among different isolates. Work is in progress to
determine whether a similar polymorphism prevails in different P. c.
chabaudi strains.

ABSTRACT Merozoite surface antigen 1 (MSA1) of several species of
Plasmodia has been shown to be a promising candidate for a vaccine
directed against the asexual blood stages of malaria. We report the
cloning and characterization of the MSA1 gene of the human malaria
parasite Plasmodium vivax. This gene, which we call Pv200, encodes a
polypeptide of 1726 amino acids and displays features described for
MSA1 genes of other species, such as signal peptide and anchoring
sequences, conserved cysteine residues, number of potential
N-glycosylation sites, and repeats consisting here of 23 glutamine
residues in a row. When the nucleotide and deduced amino acid
sequences of the MSA1 of P. vivax are compared to those of another
human malaria parasite, Plasmodium falciparum, and to those of the
rodent parasite Plasmodium yoelii, 10 regions of high amino acid
similarity are observed despite the very different dG+dC contents of
the corresponding genes. All of the interspecies conserved regions
reside within the conserved or semiconserved blocks delimited by the
sequences of different alleles of the MSA1 gene of P. falciparum.



The surface of the invasive merozoite of Plasmodia consti- 
tutes one of the potential targets of a vaccine directed against 
the blood stages of malaria. Merozoite surface antigen 1 
(MSA1), described by Holder and Freeman in 1982 (1), has 
been extensively studied in the human malarial parasite 
Plasmodium falciparum (reviewed in ref. 2). There are sev- 
eral allelic forms of this polymorphic high molecular weight 
antigen, and conserved, semiconserved, and variable regions 
can be found in the different alleles (3-5). The antigen is 
processed on the surface of the merozoite, although the exact 
stage at which processing occurs is subject to discussion (6). 
MSA1 has also been shown to bind in a specific manner to the
surface of erythrocytes and could thus constitute one of the 
merozoite surface ligands involved in invasion of the eryth- 
rocyte (7). 

A number of immunization experiments performed with 
parasite-derived or recombinant MSA1 or with MSA1 pep- 
tides in monkeys (reviewed in ref. 2) as well as in humans (8) 
point to this antigen as one of the most promising vaccine 
candidates against malaria asexual blood stages. P. falci- 
parum is the only human malarial parasite for which the 
protective properties of the MSA1 have been assessed. Since 
protective immunity in malaria is species-specific (9), it is 
unlikely that a vaccine against one species will protect against 
others. Although Plasmodium vivax is the most widely dis- 
tributed human malaria parasite, little is known about the 
properties of MSA1 in this species (10); this is partly due to 
the difficulty in obtaining large quantities of a parasite that 
cannot be maintained in continuous culture. The cloning and 
characterization of the gene coding for the MSA1 of P. vivax
should allow appropriate immunization studies to be performed
with recombinant proteins.

A portion of the P. vivax MSA1 gene (Belem strain) has been
previously characterized (11), and we present here the complete
primary structure of this gene, which we call Pv200. The organization
of Pv200 is similar to that of the MSA1 gene of P. falciparum, Pf190
(3, 12), and to that of the rodent malaria parasite Plasmodium
yoelii, Py230 (13). There are 10 regions of high amino acid
similarity conserved among the three parasite species. Since this
molecule, like many other P. vivax antigens, is otherwise polymorphic
(14, 15), such regions of interspecies conservation could be of
importance in the development of an asexual stage malaria vaccine.

MATERIALS AND METHODS 

Parasites. The P. vivax Belem strain, adapted to Saimiri 
monkeys, was used for the production of DNA (11). 

Construction and Screening of Genomic DNA Libraries. Two DNA
libraries were constructed: (i) Library A. Genomic DNA was completely
digested with EcoRI and 5 µg was fractionated on a 1% agarose gel.
Fragments between 5 and 15 kilobases (kb) were electroeluted from a
slice of the gel, extracted with phenol, and precipitated with
ethanol. Pellets were washed, dried, and dissolved in double-distilled
H2O. A 1-µg aliquot was ligated into the EcoRI arms of the λ vector
gtWES (GIBCO/BRL) according to the supplier's instructions. The
library was obtained by transforming LE392 competent cells and it was
screened with a 1.9-kb DNA insert containing a portion of the Pv200
gene, Pv200/1.9 (see Results) (11).

(ii) Library B. A 0.5-µg sample of HindIII-digested DNA was ligated
into the HindIII site of the vector pBR322 treated with calf
intestinal alkaline phosphatase (Pharmacia) and the library was
obtained by transformation of DH5α competent cells. The library was
screened with a 0.98-kb DNA insert corresponding to the first 0.98 kb
from the 5' end of the Pv200/1.9 clone (see Results).

All enzyme digestions and DNA manipulations were per- 
formed as recommended in Sambrook et al. (16). 

DNA Sequences. Dideoxy chain termination sequences (17) were obtained
by the production of exonuclease III overlapping deletion clones (18)
or by the use of oligonucleotides (17-mers) synthesized on an Applied
Biosystems PCR-Mate apparatus. Both DNA strands were sequenced for
all the results presented here. Sequences were aligned and analyzed
by using the DNA program of Staden (19). The sequences used for
homology studies were those of P. falciparum MAD20 (3) and of P.
yoelii YM (13).

RESULTS 

Isolation of the Genomic Clones Containing the Entire Pv200 Gene. We
have previously reported the isolation of a clone containing a 1.9-kb
genomic DNA insert, clone Pv200/1.9, including a portion of the P.
vivax Belem strain MSA1 gene (Pv200) (11). Using the Pv200/1.9 DNA
insert, we isolated two new clones, Pv200/7.0 and Pv200/9.0,
containing the 5' and 3' ends of the Pv200 gene, respectively.

On Southern blots of P. vivax genomic DNA digested with EcoRI the
Pv200/1.9 insert hybridized with a single 9-kb DNA fragment (not
shown). Accordingly, Pv200/1.9 was used to screen 5 × 10^4 phage
plaques from library A, and a positive clone, Pv200/9.0, was
isolated. EcoRI digestion of DNA from this clone released a 9-kb
insert, which was subcloned in the EcoRI site of the Bluescript
vector (Stratagene). The nucleotide sequence of Pv200/9.0 showed that
it contained the remaining 3.5 kb of the 3' end of the Pv200 gene.



A 0.98-kb fragment of the Pv200/1.9 insert, obtained through
digestion of Pv200/1.9 with HindIII, hybridized with a single 7-kb
band on Southern blots of genomic DNA digested with HindIII (not
shown). Library B was screened with the 0.98-kb fragment. A positive
clone, Pv200/7.0, was isolated and shown to contain a 7-kb insert,
from which the sequence of the 5' end of the Pv200 gene was
determined.

Nucleotide Sequence of the P. vivax Belem Strain Pv200 Gene. The
complete nucleotide and deduced amino acid sequences of the Pv200
gene are shown in Fig. 1. A methionine start codon at base 91
initiates a single open reading frame of 5178 bases that finishes
with the first TAA stop codon at base 5259. An A+T-rich noncoding
region follows after this stop codon. Three observations indicate
that the methionine codon at position 91 is the initiation codon in
vivo. (i) There are two stop codons immediately upstream, at
positions 64 and 76. (ii) A poly(A) sequence precedes this ATG,
possibly representing the consensus sequence for translation
initiation as described for several plasmodial genes (20). (iii) The
amino acid sequence immediately following this ATG codon has all the
features of a putative signal peptide (21). The sequence presented
here is based entirely on genomic DNA fragments. We believe, however,
that the Pv200 gene contains no introns, since a continuous open


reading frame of 1726 amino acids with a calculated molecular weight
of 194,267 is contained within the genomic fragments. This is in
agreement with the absence of introns in the genes coding for the
MSA1 of other species.

Fig. 1. Nucleotide sequence of the Pv200 gene of the Belem strain of
P. vivax and the deduced amino acid sequence. The position of the
original Pv200 clone (11) is indicated by the arrowheads. Signal and
anchoring sequences are underlined with broken and solid lines,
respectively. Amino acid residue numbers are given on the right.

A potential signal peptide and a hydrophobic membrane anchor sequence
are found at residues 1-17 and 1710-1726, respectively. Furthermore,
there are 12 potential N-glycosylation sites (Asn-Xaa-Thr/Ser) and 22
cysteines, 11 of which are located within the last 110 residues of
the COOH terminus of the molecule. The Pv200 sequence also contains a
stretch of 23 glutamines at residues 726-748.
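
Counting potential N-glycosylation sites and COOH-terminal cysteines,
as reported above, amounts to a motif scan over the deduced protein
sequence. The Python sketch below is a minimal illustration under
stated assumptions: the function names and the toy peptide are
hypothetical, and the motif is taken literally as Asn-Xaa-Thr/Ser with
any residue allowed at Xaa, as written in the text.

```python
import re

# Zero-width lookahead so that overlapping sequons (e.g. N-N-T-S)
# are each reported.
SEQUON = re.compile(r"(?=N[A-Z][ST])")


def sequon_positions(protein: str):
    """Return 1-based positions of the Asn in each Asn-Xaa-Thr/Ser."""
    return [m.start() + 1 for m in SEQUON.finditer(protein.upper())]


def cysteines_in_tail(protein: str, tail: int = 110) -> int:
    """Count Cys residues within the last `tail` residues."""
    return protein.upper()[-tail:].count("C")


# Toy peptide (hypothetical, not taken from Pv200).
toy = "MNASCNNTSKC"
print(sequon_positions(toy))      # positions of candidate Asn residues
print(cysteines_in_tail(toy, 5))  # Cys count in the last 5 residues
```

Stricter definitions of the sequon that exclude proline at Xaa are in
common use; the text does not state which was applied, so the literal
motif is used here.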

Comparison of the codon usage in the MSA1 genes of P. vivax, P.
falciparum, and P. yoelii revealed that codons which have G or C in
the third position are more frequent in P. vivax. Consequently, the
dG+dC content of the Pv200 coding region is 43.4% and differs
significantly from the dG+dC content of the coding regions of the
MSA1 genes from P. falciparum (25.7%) and P. yoelii (31%).
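
The link between third-position codon usage and overall dG+dC content
can be made concrete with a small Python sketch. It is illustrative
only: the function names gc_content and gc3_content and the toy coding
sequence are assumptions, and the input is assumed to be an in-frame
coding sequence.

```python
def gc_content(seq: str) -> float:
    """Fraction of G and C bases in a DNA sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return (seq.count("G") + seq.count("C")) / len(seq)


def gc3_content(cds: str) -> float:
    """Fraction of codons with G or C in the third position.

    Assumes `cds` starts in frame; every third base (index 2, 5, 8, ...)
    is taken as a codon third position.
    """
    return gc_content(cds.upper()[2::3])


# Toy in-frame coding sequence (hypothetical): ATG GCC AAA TAA
toy = "ATGGCCAAATAA"
print(gc_content(toy), gc3_content(toy))
```

Because many amino acids can be encoded with either an A/T or a G/C
third base, a shift in third-position preference changes the dG+dC
content of a gene without necessarily changing the protein it encodes.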

Comparisons of the Pv200, Pf190, and Py230 Sequences. The deduced
amino acid sequence from the Pv200 gene was computer-aligned with the
sequences of the Pf190 (allele MAD20) (Fig. 2) and Py230 YM (Fig. 3)
polypeptides. There is an overall identity of 35.6% and 34.3% with
the P. falciparum and P. yoelii sequences, respectively.

Interestingly, 17 out of the 22 cysteines of the Pv200 polypeptide
were located at similar positions with respect to the Pf190 and Py230
sequences. These similarities include the 11 and 10 cysteines found
at the COOH terminus of Pf190 and Py230, respectively. In contrast,
of 12 (Pv200), 15 (Pf190), and 11 (Py230) potential N-glycosylation
sites, only 3 were conserved at the same positions between the P.
vivax and the P. falciparum sequences, whereas only 1 was conserved
between the P. vivax and P. yoelii sequences.

To determine the regions with an amino acid identity near 50% among
the three parasite species, we combined the comparisons which had
been made between Pv200-Pf190/Pv200-Py230 (this work) and Py230-Pf190
(13). Fig. 4 shows the result of such analysis. Seven ICBs were
observed: ICB1, ICB2, ICB4, ICB5, ICB6, ICB8, and ICB10. Similarly,
three other blocks (CB3, CB7, and CB9) were conserved between Pv200
and Pf190 but not between Pv200 and Py230 and thus could not be
treated as bona fide ICBs. All these blocks reside within the
conserved or semiconserved blocks of the Pf190 alleles (3).

DISCUSSION 

We report the complete primary structure of the MSA1 gene of the P.
vivax Belem strain, Pv200. The general structure of the gene
resembles that of the MSA1 genes described for P. falciparum and P.
yoelii, with a number of homologous regions and other features such
as (i) conserved cysteine residues at the COOH-terminal region, (ii)
number of potential N-glycosylation sites, and (iii) the presence of
23 glutamines in a row, which could correspond in P. vivax to the
repeated sequences described in the MSA1 genes of other species.

Fig. 2. Comparison of the amino acid sequences of the P. vivax Belem
strain Pv200 (upper sequence) and of the P. falciparum MAD20 Pf190
(lower sequence) (3). Sequences were aligned by using the program of
Staden (19). Hyphens indicate gaps introduced for alignment; colons,
identical residues; and periods, similar residues. Positions of the
Pv200 cysteine residues conserved between these two proteins are
denoted by • and those that are not conserved, by ○. The position of
the Pf190 blocks determined by the sequence of different alleles (3)
is also shown; conserved blocks are indicated by unbroken underlines
and overlines, semiconserved blocks are indicated by broken
underlines and overlines, and variable blocks are unmarked.



Medical Sciences: del Portillo et al 






Proc. Natl. Acad. Sci. USA 88 (1991) 4033






Fig. 3. Comparison of the amino acid sequences of the P. vivax Belem strain Pv200 (upper sequence) and of the P. yoelii YM Py230 (lower
sequence) (13). Sequences were aligned by using the program of Staden (19). Conventions are as in Fig. 2.



Malaria parasites have been divided evolutionarily into three groups
according to the base composition of their DNA (22). One group,
comprising avian, rodent, and falciparum malarias, presents a genome
with a low dG+dC content (18%). Another, comprising the two monkey
malarias Plasmodium knowlesi and Plasmodium fragile, presents a
genome with a higher dG+dC content (30%). Finally, the group of P.
vivax and Plasmodium cynomolgi, human and monkey malarias which cause
relapses, has a genome presenting both low and high dG+dC components.
This division implies that homologous genes and their proteins should
be more similar within a group than between groups (22). Our
observations show that in the case of the MSA1 genes and their
proteins this prediction is supported only at the nucleotide level.
Indeed, the low dG+dC content of the Pf190 and Py230 genes leads to a
higher similarity, at the nucleotide level, between them than with
Pv200. However, when the amino acid composition is considered, Pv200
and Pf190 antigens show higher similarity and the overall
distribution of their shared amino acids is more highly conserved
than when Pf190 and Py230 are compared. That a higher amino acid
similarity and closer overall distribution are observed in the Pv200
and Pf190 antigens despite their very different total dG+dC content
most likely reflects the effects of positive selection within the
human host. Accordingly, three regions of homology between the Pv200
and Pf190 antigens not conserved between the Pv200 and the Py230
antigens can be found (Fig. 4).

Fig. 4. Representation of the MSA1 antigen based upon amino acid conservation among the Pv200, Pf190, and Py230 proteins (inner blocks)
and upon Pf190 alleles (outer blocks; solid-outline blocks, conserved areas; broken-outline blocks, semiconserved areas) (7). Shaded boxes
represent interspecies conserved blocks (ICBs) with greater than 48% identity among the three parasite species. Hatched boxes represent
conserved blocks (CBs) with greater than 50% identity between Pv200 and Pf190 but not between Pv200 and Py230. Open boxes represent areas
of less than 45% identity. Positions of ICBs and CBs (amino acid residues of the Pv200 sequence): ICB1, 1-50; ICB2, 107-200; CB3, 274-319;
ICB4, 348-387; ICB5, 620-691; ICB6, 796-895; CB7, 1040-1088; ICB8, 1092-1153; CB9, 1347-1464; and ICB10, 1622-1727.

The analysis of the primary structure from different alleles
of the MSA1 gene of P. falciparum allowed the definition of
conserved, semiconserved, and variable regions within the
molecule (3). One of the regions of amino acid identity higher
than 45% conserved between the Pf190 and Py230 antigens
resides within a variable block of one of the Pf190 alleles and,
consequently, Lewis (13) proposed the delimitation of new
conserved blocks within the MSA1 antigen based on interspecies
conservation. We decided to conduct a similar analysis;
regions of 50 or more contiguous amino acids presenting
50% or higher identity among the three species [Pv200 vs.
Pf190 and Pv200 vs. Py230 (this work) and Py230 vs. Pf190
(13)] are referred to as ICBs. Subsequently, the position of
such ICBs with respect to the blocks delimited by sequences
from different Pf190 alleles (3) was also examined.
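The criterion of 50 or more contiguous residues at 50% or higher identity can be illustrated with a sliding-window scan over one pairwise alignment. The sketch below is an assumption about how such a scan might be done, not the procedure actually used here; the function name and the toy alignment are hypothetical, and a three-species ICB would additionally require the block to qualify in each pairwise comparison.

```python
def conserved_blocks(a, b, window=50, min_identity=0.5):
    """Find regions of a pairwise alignment covered by qualifying windows.

    a and b are aligned sequences of equal length ('-' marks a gap).
    Returns 0-based, half-open (start, end) alignment coordinates.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    n = len(a)
    if n < window:
        return []
    # 1 where the two residues are identical and not gaps, else 0.
    match = [int(x == y and x != "-") for x, y in zip(a, b)]
    blocks = []
    running = sum(match[:window])
    for start in range(n - window + 1):
        if start > 0:
            # Slide the window one column to the right.
            running += match[start + window - 1] - match[start - 1]
        if running / window >= min_identity:
            end = start + window
            if blocks and start <= blocks[-1][1]:
                # Overlapping or adjacent windows merge into one block.
                blocks[-1] = (blocks[-1][0], end)
            else:
                blocks.append((start, end))
    return blocks


# Hypothetical alignment: 60 identical columns followed by 60 mismatches.
seq_a = "A" * 60 + "C" * 60
seq_b = "A" * 60 + "D" * 60
print(conserved_blocks(seq_a, seq_b))
```

Because every window that meets the threshold is merged, a reported block can extend into flanking columns of lower identity; trimming block edges would be a further design choice.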

All of the ICBs of MSA1 described here reside within the
conserved or semiconserved blocks delimited by different 
alleles of the P. falciparum gene (Fig. 4). That such well- 
defined regions of MSA1 have been conserved among these 
three different malaria species could be explained because 
they are functionally or structurally important for the mole- 
cule, or because they are not immunogenic, or, finally, 
because immune responses against them do not block para- 
site growth (23). On the basis of these results, we predict that 
as sequences from other alleles of the Pv200 gene are 
described, the general structure of the Pv200 gene will 
comprise blocks that will be organized in a fashion similar to 
that of the blocks delimited by different Pf190 alleles.

As for the protective properties of MSA1, most immuni- 
zation trials with P. falciparum have used either the whole 
molecule or fragments from the NH2-terminal part (reviewed
in ref. 2). In particular, the two peptides used in human 
vaccine trials belong to the regions we have defined as ICB1 
and CB3 (8). This does not exclude other portions of MSA1; 
in particular, ICB10 corresponds to the most COOH-terminal
part of the molecule. The most remarkable aspect of this part
of MSA1 is that it contains more than half of all the cysteine
residues that are conserved in position among the three 
parasite species. Significantly, a protective monoclonal an- 
tibody against a discontinuous epitope of the P. yoelii MSA1 
has been mapped to this region (24). Immunization trials with 
the MSA1 antigen of P. vivax have yet to be reported, and the 
potential protective properties of Pv200 can only be extrap- 
olated from experiments performed in other malarial species. 
The availability of the complete primary structure from the 
MSA1 gene of P. vivax should now allow the assessment of
Pv200 as a vaccine candidate.

Introduction

Large scale in vitro parasite culture cannot
provide sufficient quantities of either organisms
[...] synthesis and recombinant
DNA methodologies are being evaluated
extensively for the production of potential
protective immunogens against the protozoan
parasites that cause malaria. Because of the
severity of the disease caused by Plasmodium
falciparum, this species has represented the
[...] of such studies. Of the numerous
potential subunit vaccine candidates from this
organism, an antigen of special interest has
been the major merozoite surface antigen
Pf195. This antigen has been shown to reside
[...] of the schizont and, in a processed
form, on the surface of the merozoite [1].
Vaccination studies with Pf195, isolated from
cultured parasites, have led to high levels of
protection against P. falciparum challenge in
monkey model systems [2]. More recently,
recombinant DNA-derived Pf195 antigens
have been produced and studied as candidate
vaccines against falciparum malaria [3-5].

Although less virulent than P. falciparum,
Plasmodium vivax is the causative agent of 
benign tertian fever, a form of malaria 
characterized by frequent and protracted 
relapses. Vaccination studies against this 
species have been less extensive than with P. 
falciparum and have been focused almost 
exclusively on the sporozoite lifecycle stage of 
the organism [6-9]. Recently, however, a 
polymorphic 200-kDa component of the P. 
vivax schizont surface was defined by mono- 
clonal antibodies, and a partial genomic clone 
that encoded a portion of the antigen was
isolated, structurally defined and expressed in
bacteria [10]. Homology of this DNA fragment,
and the more recently isolated full length
gene sequence [11], with the Pf195 gene,
together with immunolocalization studies
using antisera to the expressed protein have
suggested strongly that the Pv200 protein is
functionally analogous to Pf195 [10]. Here we
report the molecular cloning and structure
analysis of the gene for Pv200 from the Sal-1
strain [8] of P. vivax. We also demonstrate that 
amino- and carboxy-terminal domains of the 
protein produced in yeast can be used to detect 
antibodies both in monkeys and in humans 
previously infected with P. vivax. 



Materials and Methods 

Construction of P. vivax genomic DNA libraries.
Genomic libraries were prepared as
follows. 500 ng of P. vivax genomic DNA,
from the Sal-1 strain, was digested with EcoRI
and ligated into EcoRI digested λZAPII
(Stratagene), packaged and introduced into
Escherichia coli strain PLK-17. A similar
HindIII digest, partially filled with dCTP and
dTTP, was ligated into XbaI digested λZAPII
that had been similarly filled with dATP and
dGTP. The ligated DNA was packaged and
transformed as above. Libraries of 5 x 10^[...]
and 4.1 x 10^7 independent clones were
obtained, respectively.

Screening of P. vivax DNA. Two overlapping
oligomers (45-mers), based on the sequence of
del Portillo et al. [10], were labeled by the
oligomer primed extension method [12],
hybridized in 40% formamide-containing buffer
[13] at 37°C to a Southern blot of EcoRV
digested P. vivax DNA, and washed at 65°C in
2 x SSC/0.1% SDS. A 9.5-kb EcoRI fragment
hybridized to this probe. Two overlapping
oligomers (42- and 43-mers) based on the
5'-end of the EcoRI clone were used to probe a
Southern blot of HindIII-digested P. vivax
DNA in a similar fashion and hybridized to a
7.0-kb fragment. Similar hybridizations were
carried out on library filters and plaque
purified tertiary positives were excised using
E. coli strain XL1-Blue (Stratagene) and
plasmid DNA was retransformed into E. coli
strain D1210 for further plasmid manipulations.

Subcloning and DNA sequencing. Plasmid 
DNA was isolated by the alkaline lysis 
method [13]. Overlapping restriction frag- 
ments were subcloned into M13 vectors and 
both strands were sequenced by the chain-termination
method [14] using M13 primers as
well as specific internal primers. DNA manipulations
were essentially as described [13].

Expression of amino- and carboxy-terminal
domains of Pv200 in Saccharomyces cerevisiae.
The polymerase chain reaction (PCR) [15] was
used to amplify DNA fragments from cloned
Pv200 gene sequences. Appropriate restriction
sites, and in-frame initiation and termination
codons were incorporated into the PCR
primers. Thus, for Pv200A, primers
5'-(dATGTCCCATGGAAACAGAAAGTTATAAGCAG)-3'
and 5'-(dCGCCCTCAACAAATCATAGTG)-3'
were used to amplify an
NcoI/EcoRI fragment from the HindIII clone
4B-3-9. A pBluescript polylinker primer,
5'-(dGTGGATCCCCCGGGCTGCAGG)-3',
and the 3'-primer
5'-(dTTCCAAGGTCGACTATGGATTTTGCAAATCACCAAATGT)-3'
were used to amplify the contiguous EcoRI/SalI
fragment from the EcoRI clone 6.1-2. The
PCR products were digested with the appropriate
restriction enzymes and ligated into
NcoI/SalI digested pBS100 [7]. A BamHI/SalI
fragment that contained the ADH2/GAPDH
[...] obtained from Saimiri boliviensis squirrel
[...] controls to assess the efficacy of a recombinant
[...] P. vivax circumsporozoite vaccine [8,9].
Anti-Cynomolgus conjugates were used to assay
[...] weeks after challenge with 10000 P. vivax
sporozoites. Human sera were collected
from a population in Brazil, outside the
endemic zone, that had encountered a single
50-day exposure to P. vivax [19]. [...]
samples were taken 8 months after [...]
was completely controlled by
chemotherapy and insecticides. Enhancement
of sensitivity in the human sera ELISAs was
[...] 3,3',5,5'-tetramethylbenzidine
(TMB) as substrate and reagents as
recommended by the manufacturer (Kirkegaard
and Perry Laboratories Inc., Gaithersburg, [...]).
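The restriction sites said to be incorporated into the PCR primers can be checked directly against the primer sequences quoted in the expression subsection. The following is a minimal sketch of such a check; the recognition sequences are the standard ones for the named enzymes, while the primer labels and the function name are hypothetical conveniences introduced here.

```python
# Standard recognition sequences for enzymes named in the text.
# All are palindromic, so scanning one strand is sufficient.
SITES = {
    "NcoI": "CCATGG",
    "EcoRI": "GAATTC",
    "SalI": "GTCGAC",
    "BamHI": "GGATCC",
    "HindIII": "AAGCTT",
}

# Primer sequences as quoted in the text; the dictionary keys are
# illustrative labels only, not names used by the authors.
PRIMERS = {
    "Pv200A 5' primer": "ATGTCCCATGGAAACAGAAAGTTATAAGCAG",
    "Pv200A second primer": "CGCCCTCAACAAATCATAGTG",
    "pBluescript polylinker primer": "GTGGATCCCCCGGGCTGCAGG",
    "3' primer": "TTCCAAGGTCGACTATGGATTTTGCAAATCACCAAATGT",
}


def find_sites(seq, sites=SITES):
    """Map each enzyme to the 0-based positions of its site in seq."""
    seq = seq.upper()
    hits = {}
    for enzyme, site in sites.items():
        positions = []
        i = seq.find(site)
        while i != -1:
            positions.append(i)
            i = seq.find(site, i + 1)
        if positions:
            hits[enzyme] = positions
    return hits


for label, primer in PRIMERS.items():
    print(label, find_sites(primer))
```

With the sequences as transcribed, the first primer carries the NcoI site, the polylinker primer a BamHI site, and the 3' primer the SalI site, which is consistent with the NcoI/EcoRI and EcoRI/SalI fragments described.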



Results and Discussion 

Molecular cloning of the Pv200 gene. Overlapping
λZAPII phage clones were isolated
that contained the large open reading frame
encoding Pv200 (Fig. 1). Most of the Pv200
[...] is encoded by the EcoRI
clone (6.1-2). The amino-terminal region is
[...] (4B-3-9). The full DNA sequence of a 5.83-kb
[...] fragment that includes the
entire Pv200 coding sequence is shown (Fig. 2).
[...] promoter elements [20] are found
within the 5'-region of the sequence, and a
consensus motif for efficient initiation of
translation [21] is apparent around the proposed
initiation codon.

Structure of Pv200. As deduced from the
composite sequence, Pv200 is derived from a
1751-amino acid precursor protein that contains
[...] sequence [...] a transmembrane domain
(TMD) (Fig. [...]). A predicted signal peptidase
cleavage site [22] occurs after Cys 19. Also
within the hydrophobic TMD (Fig. 2) is a
[...] of the signal for attachment of
glycosylphosphatidyl inositol anchor sequences to proteins
of various protozoans, including Pf195 [23].

[Fig. 1 caption, partial] [...] DNA sequencing: HindIII (H), EcoRI (E) and [...] (X). Potential transmembrane domain (TMD) are shown in
[...] the serine/threonine-rich region (STR). The signal [...] are the regions Pv200A and
[...] black, and a potential proteolytic [...] in yeast.



Overall amino acid sequence identity between
our Pv200 and Pf195 from various sources [24]
is in the 36-37% range, although a region of
more extensive identity (45%) exists in the
amino-terminal region [10, and present study].
Similarly, amino acid sequence identities
between Pv200 and the major merozoite
surface antigens of P. chabaudi chabaudi and
P. yoelii yoelii were 34.6% and 34.9%,
respectively [25,26]. Also present is a potential
proteolytic cleavage site identical to that
defined previously for the generation of the
carboxy-terminal p42 protein of the Pf195
precursor [27] (Figs. 1 and 2, arrowed). Also
in common with Pf195, Pv200 does not contain
multiple repetitive elements that are characteristic
of many other proteins of malaria
parasites. Six serine/threonine-rich motifs of 5
amino acids are noted, however (Figs. 1, 2).
Also, the striking 23-glutamine residue repeat
of the Belem strain Pv200 protein was not
present in the sequence encoded by our cloned
gene but rather, was replaced by a 35-amino
acid residue stretch that contained only 6
glutamine residues. This, together with several
other smaller insertions and deletions accounts
for the larger size of the Sal-1 Pv200 precursor
over that of the 1726 amino acid Pv200
precursor from the Belem strain. Despite this
size difference, the two Pv200 precursors are
relatively well conserved, with an overall
amino acid sequence identity of 81%. As with
the major merozoite surface antigens from all
Plasmodium species that have been studied
thus far, this intra-species homology could be
divided into areas of the protein of relatively
low amino acid sequence identity separated by
conserved or semi-conserved regions [11,24].
We have identified thirteen distinct regions of
Pv200 that are typified by their amino acid
sequence identity levels between the Sal-1 and
Belem strains (Fig. 3). Although not fully
[...] their junctions [11,24], the existence of this
distinct pattern suggests the possibility that
Pv200 diversity could be generated by intragenic
recombination of a limited number of
alleles as is the case with Pf195 [24,28]. Of
further note, the carboxy-terminal region of
the Sal-1 strain Pv200, equivalent to p42 of P.
falciparum, is encoded to a large degree by 2 of
the highly conserved gene segments (11 and 13)
and exhibits 91% amino acid sequence identity
with the corresponding region of the Belem
strain protein.

Fig. 2. Composite DNA sequence of a 5.83-kb
fragment containing the Pv200 coding sequence. The 1751-amino
acid large open reading frame is shown with
proposed amino-terminal signal and carboxy-terminal
membrane-spanning sequences underlined. Also underlined
are 13 asparagine residues that are potential sites for
glycosylation. A 30-amino acid serine/threonine-rich
region between amino acids 241 and 270 that contains 6
copies of the 5 amino acid repeat motif G.S.(S,T)[...]
is also noted (double arrow). A potential
proteolytic cleavage site (see text) after amino acid
Glu1356 is arrowed.
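The caption to Fig. 2 marks asparagine residues as potential glycosylation sites without stating how they were chosen. A common convention, assumed here rather than taken from the paper, is to flag asparagines in the sequon N-X-S/T, where X is any residue except proline. The sketch below applies that rule to an invented toy peptide; the function name is hypothetical.

```python
import re

# Asn followed by any residue except Pro, then Ser or Thr.
# The lookahead allows overlapping sequons to be reported.
SEQUON = re.compile(r"N(?=[^P][ST])")


def n_glyc_sites(protein):
    """Return 1-based positions of Asn residues in N-X-S/T sequons."""
    return [m.start() + 1 for m in SEQUON.finditer(protein.upper())]


# Hypothetical peptide, not part of the Pv200 sequence.
toy = "MKNASLLNPTQNNTSA"
print(n_glyc_sites(toy))  # [3, 12, 13]
```

In the toy peptide the asparagine at position 8 is skipped because it is followed by proline, while positions 12 and 13 are both reported since their sequons overlap.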

Expression of Pv200 in yeast. Two domains
of Pv200 were selected for expression studies
(Fig. 1). The first, designated Pv200A, includes
amino acids 20-462 of the Pv200 precursor.
Pv200A represents, therefore, a protein of
approx. 49 kDa from the amino-terminal
region of Pv200. A similarly expressed region
of Pf195 can elicit good immunological
responses against native Pf195 in mice and
rabbits (these authors, S.P. Chang, G.S. Hu,
unpublished observations). Furthermore, this
region of Pf195, expressed in bacteria, has been
shown previously to induce partial protection
in Aotus monkeys that were subjected to blood
stage challenge with P. falciparum [5]. The
second domain that was [...]
containing amino acids [...]57-1729 of the
Pv200 precursor, is the homolog of the
carboxy terminal fragment of Pf195 that has
been structurally defined as p42 [27]. This
protein is of considerable importance since the
carboxy-terminus of the major merozoite
surface antigen has been implicated in the
induction of a protective immune response
against P. falciparum infection [26,29].
Additional studies on this P. falciparum antigen
have indicated that secreted recombinant p42,
from insect cells, is recognized by
conformation-dependent antibodies [3].
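The figure of approx. 49 kDa quoted for Pv200A can be checked with a rule-of-thumb calculation. The sketch below assumes an average residue mass of 110 Da, a general approximation that is not taken from the paper; an exact value would need the actual amino acid composition. The function name is hypothetical.

```python
# Rule-of-thumb average mass per amino acid residue, in daltons.
# This is an assumed approximation, not a value given in the text.
AVG_RESIDUE_MASS_DA = 110.0


def approx_mass_kda(first, last, avg=AVG_RESIDUE_MASS_DA):
    """Estimate the mass in kDa of a fragment spanning residues first..last."""
    if first < 1 or last < first:
        raise ValueError("expected 1 <= first <= last")
    n_residues = last - first + 1
    return n_residues * avg / 1000.0


# Pv200A spans amino acids 20-462 of the precursor (443 residues).
print(round(approx_mass_kda(20, 462), 1))  # 48.7
```

The estimate of about 48.7 kDa agrees with the stated size of approx. 49 kDa.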

Each gene construct was generated by PCR
[15] and expressed in the yeast S. cerevisiae.
Pv200A was produced at particularly high
levels (Fig. 4A, lane 3) and represented 81% of
the insoluble fraction, after lysis in a nonionic
detergent-containing buffer. This protein could
be purified to > 95% by a one step purification
scheme. Pv200B was expressed at moderate
levels (<20% of total yeast Triton X-100
buffer insoluble fraction), but could be purified
to greater than 85% using a 2-step purification
procedure. The purified Pv200A and Pv200B



proteins were shown to recognize antibodies in 
sera from individuals with a previous history of 
P. vivax infection [19]. Pooled human sera
from such individuals, who were positive in
their responses to the P. vivax CS protein (see
below), were shown to further react with the
recombinant merozoite surface antigens, by
immunoblot analysis (Fig. 4B).

Seroreactivity of recombinant Pv200 antigens.
Pv200A and Pv200B were used to
analyze sera from Saimiri monkeys that were 
infected previously with the Sal-1 strain of P.
vivax by sporozoite challenge [8,9]. Somewhat 
surprisingly, the ELISA titers from these 
animals were low and were not indicative of 
disease state. In only 2 animals did we observe 
high titer responses, and in each case this was 
against Pv200B (data not shown). The reasons 
for this absence of high titer responses are 
currently unknown. However, these relatively 
low overall titers may reflect the fact that this 
was the primary challenge of a group of 
monkeys that had not been exposed previous- 
ly to P. vivax malaria. In contrast, we observed 
relatively high ELISA titers in a human 
population that had been subject to a single 
outbreak of vivax malaria (Fig. 5). ELISA 
titers against both Pv200A and Pv200B were 
above control values, as were titers against the 
recombinant CS protein (Vivax2) [30]. Notice- 
ably, titers against Pv200B were consistently 
higher than those against Pv200A. The most 
direct explanation for this is that the carboxy- 
terminal region of Pv200 is simply more 




Fig. 5. ELISA titers of human sera. Sera from individuals 
with overt infections, in which parasitemia was detected by 
thick blood smears, and who were treated with oral 
chloroquine (numbers 1-16). Three individuals were also 
included as controls, one positive (number 17) who had 
had multiple P. vivax and P. falciparum infections, with 
titers of around 655000 and 24000 for Pv200A and B 
respectively, and 2 negative (numbers 18 and 19) who were 
never exposed to P. vivax infections. Individuals in each 
group were assayed for reactivity against the P. vivax CS 
protein using a recombinant CS protein (Vivax2) [30] as 
well as against Pv200 A and B. 



immunogenic than the amino-terminal do- 
main, at least in a primary infection in 
humans. Alternatively, Pv200B might possess 
greater conformational integrity than Pv200A 
when compared with the corresponding re- 
gions of the native proteins, and might thus be 
more antigenic in the ELISA format. A third 
explanation is that higher titers against Pv200B 
could reflect the more conserved structure of 
this region. For example, the region of the Sal- 
1 Pv200 protein defined by our Pv200A 
molecule shares 76% amino acid sequence 
identity with the corresponding region of the 
Belem strain Pv200 protein whereas, as men- 
tioned above, the carboxy-terminal regions, 
corresponding to Pv200B, are 91% identical. In 
conclusion, we have demonstrated that recom- 
binant Pv200 proteins produced in recombi- 
nant yeast are able to recognize antibodies in 
infected monkeys and humans. The production 
of these proteins in large quantities will allow 
further and more detailed studies of their 
antigenicity, and their potential use as diag- 
nostic reagents. In addition, their ability to 
elicit protective immune responses in experimental
animals can now be evaluated.

DNA sequence of the native (gp190n) and of the synthetic gene (gp190s) for gp190
from FCG-1
[Figure: alignment of the deduced amino acid sequence (AS) with the native (gp190n) and synthetic (gp190s) nucleotide sequences; numbering runs to amino acid 1639 and nucleotide 4940.]